Abstract

In this paper we derive the asymptotic distributions of two distinct regularized
estimators for functional canonical correlation as well as their associated
eigenvalues, eigenvectors and projection operators. The methods we develop
utilize regularized estimators which approach the functional operators based
in reproducing kernel Hilbert spaces (RKHS) as the regularization parameter
approaches zero. In addition to providing some justification for the RKHS
methods, we explore the asymptotics of regularized operators associated with
both Tikhonov and truncated singular value decomposition (TSVD) type
regularization. Together, these represent two of the most commonly utilized
forms of regularization.

Keywords: Canonical Correlation; Asymptotic Distributions; Stochastic 
Processes; Reproducing Kernel Hilbert Spaces; Regularization; Inverse 
Problems 

AMS 2000 Subject Classification: Primary 62H20, 60E05, 62H25, 62M99, 
45B05, 45Q05 


1. Introduction 

The goal of multivariate canonical correlation analysis (MCCA; Hotelling
[19]) is to identify and quantify the associations between two finite-dimensional
random vectors $X_1$ and $X_2$. Recently interest has focused on the extension
of this notion to the collection and analysis of “functional data”, where
the term refers to observations that are curves or sample paths of continuous
time stochastic processes. Although development of statistical methodology
for the analysis of functional data has been an active research area for well
over twenty years, the current popularity of functional data analysis (FDA) is
due, in large part, to monographs by Ramsay and Silverman [30] [29]. What
separates functional data from ordinary multivariate data is that the observed
data are sample paths from stochastic processes $X_1(\cdot)$ and $X_2(\cdot)$, which are
assumed to be elements of some infinite dimensional and separable Hilbert space
consisting of functions defined on an index set $E$, such as $[0,1]$ or $\mathbb{Z}$. In this
setting, the covariance matrices which are central to the development of the 
theory of MCCA are replaced by covariance operators of integral type. In this 
infinite dimensional case, some difficulties regarding the definition of the sample 
canonical correlation have already been observed in Leurgans et al. [25]. These 
authors argue that some kind of smoothing or regularization is indispensable 
when dealing with estimating the sample canonical correlation. The source of 
the difficulty in the functional data case is that the sample estimators for
covariance operators have finite rank while, in principle, they operate in an
infinite dimensional Hilbert space. Leurgans et al. [25] point out that, as a
consequence, the sample principal canonical correlation will always be 1 if no
regularization or smoothing is done. This problem originates from the fact that
when the number of time points at which the processes are measured becomes
larger than the sample size, it will always be possible to find linear
combinations of both processes which are perfectly correlated. From a functional
analysis standpoint, the covariance operators involved in the analysis require
regularization as they are Hilbert-Schmidt, hence compact, and thus do not
possess a bounded inverse (see e.g., Rynne and Youngson [33]). The situation
involved with functional canonical correlation analysis (FCCA) is analogous,
therefore, to the classic inverse problem of finding approximate solutions to
Fredholm integral equations. Much like this classic problem, regularization
plays an instrumental role in its resolution.
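
The degeneracy described above can be reproduced numerically. The following is
a minimal sketch and not part of the original analysis: it assumes two
independent Gaussian samples observed at more grid points than there are curves,
and the helper name `perfectly_correlated_weights` is hypothetical.

```python
import numpy as np

def perfectly_correlated_weights(X1, X2):
    """Hypothetical helper: find weights a, b with X1c @ a == X2c @ b.

    Assumes both centered data matrices have at least as many columns
    (grid points) as rows (curves), so their column spaces generically
    contain every centered vector of length n.
    """
    X1c = X1 - X1.mean(axis=0)
    X2c = X2 - X2.mean(axis=0)
    # Any centered target works; this one is chosen only for illustration.
    target = np.arange(X1.shape[0], dtype=float)
    target -= target.mean()
    a, *_ = np.linalg.lstsq(X1c, target, rcond=None)
    b, *_ = np.linalg.lstsq(X2c, target, rcond=None)
    return a, b, X1c @ a, X2c @ b

rng = np.random.default_rng(0)
n, p, q = 20, 50, 60                 # fewer curves than grid points
X1 = rng.standard_normal((n, p))
X2 = rng.standard_normal((n, q))     # generated independently of X1
a, b, u, v = perfectly_correlated_weights(X1, X2)
sample_corr = np.corrcoef(u, v)[0, 1]
print(sample_corr)
```

Although the two samples are independent, the unregularized sample correlation
of the two linear combinations equals one up to rounding, which is the
behaviour that motivates regularization.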

This paper is organized as follows. In Section 2 we introduce the notation,
definitions and assumptions which we will utilize throughout the paper. In
this section we will also discuss why reproducing kernel Hilbert spaces (RKHS)
provide the ideal setting in which to solve the functional canonical correlation
analysis (FCCA) problem. In Section 3 we will introduce the notions of canonical
correlation and discuss why the Eubank and Hsing [15] approach to FCCA
provides the most complete definition of canonical correlation analysis without
regularization. In Section 4 we will discuss the general theory associated with
regularization and introduce both the Tikhonov and truncated singular value
decomposition (TSVD) types of regularization (Engl et al. [14]). In Sections 5
and 6 we will discuss the consistency and asymptotic distributional theory
associated with Tikhonov regularized canonical correlation operators, and in
Sections 7 and 8 we will do the same with TSVD regularization. Finally, Section
9 will be devoted to summarizing our conclusions and providing some further
recommendations.


2. Basic notation, definitions and assumptions 

Let $E$ be a subset of $\mathbb{R}$ and $\nu$ a sigma-finite measure on $E$. We then consider
the case where a stochastic process $\{X(t), t \in E\}$ takes values in the
Hilbert space $\mathcal{H} = L^2(E)$ of square integrable functions on $E$ with inner product
$\langle f, g\rangle_{\mathcal{H}} = \int_E f(t)g(t)\,d\nu(t)$. Throughout it will be assumed that

$$E\|X\|_{\mathcal{H}}^{2} < \infty. \tag{1}$$




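Under this assumption, $E\langle X, f\rangle_{\mathcal{H}}^{2} = \int_{\mathcal{H}} \langle x, f\rangle_{\mathcal{H}}^{2}\,dP(x) < \infty$ for all $f \in \mathcal{H}$, with
$P$ denoting the induced probability measure of $X$ on $\mathcal{H}$. The Riesz-Fréchet
representation theorem then ensures the existence of an element $\mu \in \mathcal{H}$ such
that $E\langle X, f\rangle_{\mathcal{H}} = \langle \mu, f\rangle_{\mathcal{H}}$. Under assumption (1), the Riesz-Fréchet representation
theorem also ensures the existence of the covariance operator $S : \mathcal{H} \to \mathcal{H}$, which
is given by

$$E[\langle f, X-\mu\rangle_{\mathcal{H}}\langle X-\mu, g\rangle_{\mathcal{H}}] = E[\langle f, ((X-\mu)\otimes_{\mathcal{H}}(X-\mu))g\rangle_{\mathcal{H}}] = \langle f, Sg\rangle_{\mathcal{H}} \tag{2}$$

where $\otimes_{\mathcal{H}}$ is the tensor product in $\mathcal{H}$, defined by $(f \otimes_{\mathcal{H}} g)h = \langle f, h\rangle_{\mathcal{H}}\,g$ for
all $f, g, h \in \mathcal{H}$. We may also write $S = E[(X-\mu)\otimes_{\mathcal{H}}(X-\mu)]$. It is also well
known that the covariance operator $S$ is self-adjoint, non-negative definite, and
has finite trace (see Laha and Rohatgi [33]). The finite trace property ensures
that $S$ is Hilbert-Schmidt and hence compact.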

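For any abstract Hilbert spaces $\mathcal{M}$ and $\mathcal{N}$ let $\mathcal{B}(\mathcal{M},\mathcal{N})$ denote the Banach
space of all bounded operators that map $\mathcal{M}$ to $\mathcal{N}$. A subclass of $\mathcal{B}(\mathcal{M},\mathcal{N})$ is
$\mathcal{K}(\mathcal{M},\mathcal{N})$, which will denote the set of all compact operators that map $\mathcal{M}$ to
$\mathcal{N}$. Of particular importance in this paper is the subclass of compact operators
$A$ for which $\mathrm{tr}(A^{*}A)$ is finite, known as Hilbert-Schmidt operators. Let $\mathcal{K}_{HS}(\mathcal{M},\mathcal{N})$
denote the set of all Hilbert-Schmidt operators that map $\mathcal{M}$ to $\mathcal{N}$. In this
paper we will use the simplifying notation $\mathcal{B}(\mathcal{M}) = \mathcal{B}(\mathcal{M},\mathcal{M})$, $\mathcal{K}(\mathcal{M}) =
\mathcal{K}(\mathcal{M},\mathcal{M})$ and $\mathcal{K}_{HS}(\mathcal{M}) = \mathcal{K}_{HS}(\mathcal{M},\mathcal{M})$. The ordinary operator norm on $\mathcal{B}(\mathcal{M})$
will be denoted by $\|\cdot\|$. The set of Hilbert-Schmidt operators $\mathcal{K}_{HS}(\mathcal{M})$ becomes
a separable Hilbert space when it is endowed with the inner product

$$\langle A, B\rangle_{HS} = \sum_{k=1}^{\infty} \langle Ae_k, Be_k\rangle_{\mathcal{M}} = \mathrm{tr}(A^{*}B), \quad A, B \in \mathcal{K}_{HS}(\mathcal{M}) \tag{3}$$

with $\{e_k\}_{k=1}^{\infty}$ denoting any complete orthonormal system (CONS) for $\mathcal{M}$. This
inner product does not depend on the choice of basis (Kato [22]). The inner
product, norm and tensor product on $\mathcal{K}_{HS}(\mathcal{M})$ will be denoted by $\langle\cdot,\cdot\rangle_{HS}$,
$\|\cdot\|_{HS}$ and $\otimes_{HS}$ respectively.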

Next, assume that $\mathcal{H}_1$ and $\mathcal{H}_2$ are two closed subspaces of $\mathcal{H}$ such that

$$\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2, \quad \mathcal{H}_1 \perp \mathcal{H}_2$$

and let $\Pi_i$ denote the orthogonal projection operator of $\mathcal{H}$ onto $\mathcal{H}_i$
for $i = 1,2$. Suppose further that $X_i = \Pi_i X$, $\mu_i = \Pi_i \mu$, and let $S_{ij}$ denote the
restriction of $S$ to $\mathcal{H}_i$ and $\mathcal{H}_j$ for $i,j = 1,2$ so that $S_{ij} = \Pi_j S \Pi_i$. Because the
$\Pi_i$ are bounded and $S$ is Hilbert-Schmidt, the $S_{ij}$ for $i,j = 1,2$ are also
Hilbert-Schmidt and compact. In addition, the $S_{ii}$ are self-adjoint and non-negative
definite. For convenience, we henceforth denote $S_{ii} = S_i$.

For $i = 1,2$, let $\{\phi_{in}\}_{n=1}^{\infty}$ be an orthonormal basis corresponding to eigenvectors
of $S_i$ with $\{\lambda_{in}\}_{n=1}^{\infty}$ the corresponding sequence of non-negative eigenvalues.
Since $S_i$ is self-adjoint, non-negative and compact we may write

$$S_i = \sum_{n=1}^{\infty} \lambda_{in}\,\phi_{in} \otimes_{\mathcal{H}_i} \phi_{in} \tag{4}$$

with $\lambda_{i1} \ge \lambda_{i2} \ge \cdots \ge 0$ a decreasing sequence whose only limit can be zero.

For our purposes we might as well assume without loss of generality (WLOG)
that $\{\phi_{in}\}_{n=1}^{\infty}$ is a CONS for $\mathcal{H}_i$, $S_i$ is strictly positive and $\mathcal{H}_i = \ker(S_i)^{\perp}$ for
$i = 1,2$. We make this assumption since if $\varphi \in \ker(S_i)$ then $\mathrm{Var}[\langle\varphi, X_i\rangle_{\mathcal{H}_i}] =
\langle\varphi, S_i\varphi\rangle_{\mathcal{H}_i} = 0$, which would have the consequence that $\langle\varphi, X_i\rangle_{\mathcal{H}_i} = \langle\mu_i, \varphi\rangle_{\mathcal{H}_i}$
with probability one. It is also convenient at this juncture to assume, WLOG,
that the mean of the process is zero because if this does not hold we may
always consider the covariance of the process $X(\cdot) - \mu(\cdot)$ instead. It should
be mentioned that in (4) the list of eigenvalues $\{\lambda_{in}\}$ is repeated according to
their multiplicity. An alternative expression for (4) involving eigenprojection
operators is

$$S_i = \sum_{h=1}^{\infty} \lambda_{ih} P_{ih}, \quad \text{for } i = 1,2 \tag{5}$$

where $\{\lambda_{ih}\}$ are the distinct elements of $\{\lambda_{in}\}$, and $P_{ih}$ is the finite dimensional
projection operator onto the eigenspace associated with each distinct $\lambda_{ih}$ given
by

$$P_{ih} = \sum_{n:\,\lambda_{in} = \lambda_{ih}} \phi_{in} \otimes_{\mathcal{H}_i} \phi_{in}. \tag{6}$$

Since the processes $\{X_i(\cdot)\}_{i=1}^{2}$ are of second order, they admit a Karhunen-Loève
expansion $X_i(\cdot) = \sum_{n=1}^{\infty} Z_{in}\phi_{in}(\cdot)$ with the random variables $Z_{in}$ defined by
$Z_{in} = \langle X_i, \phi_{in}\rangle_{\mathcal{H}_i}$ (see Ash and Gardner [3] or Doob [12]). These variables are
orthogonal in the sense that $\mathrm{Cov}[Z_{ij}, Z_{ik}] = \langle\phi_{ij}, S_i\phi_{ik}\rangle_{\mathcal{H}_i} = \lambda_{ij}\delta_{jk}$ with $\delta_{jk}$
denoting the Kronecker delta function. Mercer’s theorem then ensures that the
covariance functions of the processes $\{X_i(t), t \in E\}_{i=1}^{2}$ are

$$K_{ii}(s,t) = E[X_i(s)X_i(t)] = \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} E[Z_{in}Z_{im}]\,\phi_{in}(s)\phi_{im}(t) = \sum_{n=1}^{\infty} \lambda_{in}\phi_{in}(s)\phi_{in}(t) \quad \text{for } i = 1,2. \tag{7}$$

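Moreover, the cross-covariance kernel is then

$$K_{12}(s,t) = E[X_1(s)X_2(t)] = \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} E[Z_{1n}Z_{2m}]\,\phi_{1n}(s)\phi_{2m}(t) = \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} \gamma_{nm}\,\phi_{1n}(s)\phi_{2m}(t) \tag{8}$$

with $\gamma_{nm} = E[Z_{1n}Z_{2m}]$, and we note that $K_{12}(s,t) = K_{21}(t,s)$. For notational
simplicity let $K_{ii}(s,t) = K_i(s,t)$. It is also well known that for all $f \in \mathcal{H}_i$

$$(S_{ij}f)(t) = \int_E K_{ij}(s,t)f(s)\,d\nu(s) = \langle K_{ij}(\cdot,t), f\rangle_{\mathcal{H}_i}, \quad \text{for } i,j = 1,2. \tag{9}$$

An alternate form for the cross-covariance operator $S_{12} : \mathcal{H}_2 \to \mathcal{H}_1$ is given by

$$S_{12} = \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} \gamma_{mn}\,\phi_{2n} \otimes_{\mathcal{H}_2} \phi_{1m} = S_{21}^{*}. \tag{10}$$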
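
The Mercer expansion (7) can be illustrated on a concrete kernel. The sketch
below is an illustration only: it assumes the Brownian motion kernel
$K(s,t) = \min(s,t)$ on $[0,1]$, which is not discussed in the paper, and
approximates the integral operator on a midpoint grid. For this kernel the
eigenvalues are known to be $1/((n - 1/2)^2\pi^2)$.

```python
import numpy as np

m = 500
t = (np.arange(m) + 0.5) / m            # midpoint grid on [0, 1]
K = np.minimum.outer(t, t)              # assumed kernel K(s, t) = min(s, t)

# Discretised integral operator: (S f)(t_i) ~ (1/m) * sum_j K(t_i, t_j) f(t_j)
evals, evecs = np.linalg.eigh(K / m)
evals, evecs = evals[::-1], evecs[:, ::-1]   # decreasing order
phi = evecs * np.sqrt(m)                # approximately L2[0,1]-normalised

n_idx = np.arange(1, 4)
exact = 1.0 / ((n_idx - 0.5) ** 2 * np.pi ** 2)

# Truncated Mercer sum using the 50 leading eigenpairs
K_trunc = (phi[:, :50] * evals[:50]) @ phi[:, :50].T
max_trunc_error = np.abs(K - K_trunc).max()
print(evals[:3], exact, max_trunc_error)
```

The leading numerical eigenvalues agree closely with the analytic ones, and the
truncated sum already reproduces the kernel to within a small uniform error.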

In addition to $\mathcal{H} = \ker(S)^{\perp} \subset L^2(E)$, two additional types of Hilbert spaces
will play prominent roles in further developments. The first type of Hilbert space
are the reproducing kernel Hilbert spaces (RKHS) associated with the symmetric
covariance kernels $K_i(\cdot,*)$, denoted $\mathcal{H}(K_i)$ (see Aronszajn [2] or Berlinet and
Thomas-Agnan [4]). The second type are the Hilbert spaces generated by each
stochastic process, denoted $L^2_{X_i}$ with $i = 1,2$ (see Parzen [28]). To construct
both of these Hilbert spaces we first let $\{t_1,\ldots,t_n\}$ be any finite collection of
points in $E$ and let $\mathbf{X}_{in} = [X_i(t_1),\ldots,X_i(t_n)]'$ with $\mathbf{K}_{in} = \{K_i(t_j,t_k)\}_{j,k=1}^{n}$
denoting the covariance matrix of $\mathbf{X}_{in}$ for each $n \in \mathbb{N}$. Next we define the
pre-Hilbert space generated by the process to be the set of all arbitrary finite
dimensional linear combinations of the process, i.e. $L^2_{\mathbf{X}_{in}} = \{U = \mathbf{a}'\mathbf{X}_{in} : \mathbf{a} \in
\ker(\mathbf{K}_{in})^{\perp} \subset \mathbb{R}^n\}$ where the inner product between two elements is given by

$$\langle \mathbf{a}'\mathbf{X}_{in}, \mathbf{b}'\mathbf{X}_{in}\rangle_{L^2_{\mathbf{X}_{in}}} = \mathrm{Cov}[\mathbf{a}'\mathbf{X}_{in}, \mathbf{b}'\mathbf{X}_{in}] = \mathbf{a}'\mathbf{K}_{in}\mathbf{b}. \tag{11}$$

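Likewise, the pre-Hilbert space of the RKHS is defined to be the column space
of $\mathbf{K}_{in}$, i.e. $\mathcal{H}(\mathbf{K}_{in}) = \{\mathbf{f} = \mathbf{K}_{in}\mathbf{a} : \mathbf{a} \in \ker(\mathbf{K}_{in})^{\perp} \subset \mathbb{R}^n\}$ and the inner product
between any $\mathbf{f} = \mathbf{K}_{in}\mathbf{a}$ and $\mathbf{g} = \mathbf{K}_{in}\mathbf{b}$ is given by

$$\langle \mathbf{f}, \mathbf{g}\rangle_{\mathcal{H}(\mathbf{K}_{in})} = \mathbf{f}'\mathbf{K}_{in}^{\dagger}\mathbf{g} = \mathbf{a}'\mathbf{K}_{in}\mathbf{b} \tag{12}$$

where $\mathbf{K}_{in}^{\dagger}$ denotes the Moore-Penrose inverse of $\mathbf{K}_{in}$. The Parzen-Loève congruence
mapping $\Psi_i$ is determined uniquely by $\Psi_i(K_i(\cdot,t)) = X_i(t)$ for each $t \in E$
with the result that every linear combination $U$ of the $\mathbf{X}_{in}$ vector with nonzero
variance can be expressed as

$$U = \Psi_{in}(\mathbf{f}) = \mathbf{f}'\mathbf{K}_{in}^{\dagger}\mathbf{X}_{in} \tag{13}$$

for some $\mathbf{f} \in \mathcal{H}(\mathbf{K}_{in})$ (see King [20]). It is a simple matter to see that the inner
product given by (12) satisfies the reproducing property (see Aronszajn [2]). To
see this let $k$ be any index in $1,\ldots,n$ and let $\mathbf{K}_{in}(t_k,\cdot) = \mathbf{K}_{in}'(\cdot,t_k)$ denote the
$k$-th row of $\mathbf{K}_{in}$. Now for any $\mathbf{f} = \mathbf{K}_{in}\mathbf{a} \in \mathcal{H}(\mathbf{K}_{in})$ we have that

$$\langle \mathbf{K}_{in}(\cdot,t_k), \mathbf{f}\rangle_{\mathcal{H}(\mathbf{K}_{in})} = \mathbf{K}_{in}(t_k,\cdot)\mathbf{K}_{in}^{\dagger}\mathbf{K}_{in}\mathbf{a} = \mathbf{K}_{in}(t_k,\cdot)\mathbf{a} = f(t_k).$$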

This demonstrates that the pre-Hilbert space $\mathcal{H}(\mathbf{K}_{in})$ given by the inner product
defined in (12) must be the unique RKHS of the process $\{X_i(t_j)\}_{j=1}^{n}$. To complete
the construction of $\mathcal{H}(K_i)$ and $L^2_{X_i}$ we then extend the realm of the pre-Hilbert
spaces, which presently apply to any finite collection of points $\{t_1,\ldots,t_n\} \subset E$,
to the index set $E$ in its entirety. This construction is accomplished through
Cauchy completion, or adding in the limits of arbitrary linear combinations of
the form $\sum_j a_j K_i(\cdot,t_j)$ and $\sum_j a_j X_i(t_j)$. In this fashion, we see that

$$\mathcal{H}(K_i) = \overline{\left\{f : f(\cdot) = \int_E K_i(\cdot,t)a(t)\,d\nu(t)\right\}} = \overline{\mathrm{Im}(S_i)} = \ker(S_i)^{\perp} \tag{14}$$

and

$$L^2_{X_i} = \overline{\left\{U : U = \int_E a(t)X_i(t)\,d\nu(t)\right\}} \tag{15}$$

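with $\overline{A}$ denoting the closure of any set $A$. In this infinite dimensional setting
the RKHS is the set of functions on $E$ given by

$$\mathcal{H}(K_i) = \left\{f : f(\cdot) = \sum_{j=1}^{\infty} \lambda_{ij}f_{ij}\phi_{ij}(\cdot), \ \|f\|^2_{\mathcal{H}(K_i)} = \sum_{j=1}^{\infty} \lambda_{ij}f_{ij}^2 < \infty\right\} \tag{16}$$

where the $f_{ij}$ are generalized Fourier coefficients relative to the
CONS $\{\phi_{ij}\}_{j=1}^{\infty}$ for $\mathcal{H}_i$, $i = 1,2$. An application of the integral representation
theorem of Parzen [28] then produces the following result.

Theorem 2.1. For $i = 1,2$ let $f(\cdot) = \sum_{j=1}^{\infty} \lambda_{ij}f_{ij}\phi_{ij}(\cdot)$ be in $\mathcal{H}(K_i)$. Then,

$$\Psi_i(f) = \sum_{j=1}^{\infty} f_{ij}Z_{ij} \quad \text{and} \quad \Psi_i^{-1}\left(\sum_{j=1}^{\infty} f_{ij}Z_{ij}\right) = \sum_{j=1}^{\infty} \lambda_{ij}f_{ij}\phi_{ij} \tag{17}$$

with $Z_{ij} = \langle X_i, \phi_{ij}\rangle_{\mathcal{H}_i}$ and $\Psi_i^{-1} = \Psi_i^{*}$, where $\Psi_i^{*}$ denotes the adjoint of $\Psi_i$.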
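
The finite-dimensional construction in (12) can be checked numerically. The
following sketch is illustrative and rests on assumptions not made in the
paper: the kernel $\min(s,t)$ and the evaluation points are chosen arbitrarily,
with one point repeated so that the covariance matrix is singular and the
Moore-Penrose inverse is genuinely needed.

```python
import numpy as np

pts = np.array([0.2, 0.5, 0.5, 0.9])        # repeated point: K_n is singular
K = np.minimum.outer(pts, pts)              # assumed kernel min(s, t)
K_pinv = np.linalg.pinv(K, rcond=1e-10)     # Moore-Penrose inverse of K_n

rng = np.random.default_rng(1)
a, b = rng.standard_normal(4), rng.standard_normal(4)
f, g = K @ a, K @ b                         # elements of the column space

inner_fg = f @ K_pinv @ g                   # inner product as in (12)
reproduced = K @ K_pinv @ f                 # entry k is <K_n(., t_k), f>
print(inner_fg, a @ K @ b)
print(reproduced, f)
```

Entry $k$ of `reproduced` equals $f(t_k)$, which is the reproducing property
for the pre-Hilbert space, and `inner_fg` agrees with $\mathbf{a}'\mathbf{K}\mathbf{b}$.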

The importance of the RKHS inner product when formulating theory regarding
integral operators was shown by Nashed and Wahba [27]. These authors
provided a characterization of the RKHS $\mathcal{H}(K_i)$ generated by the kernel
$K_i$ as the closure of the image of the integral operator for the symmetric square
root, $\mathrm{Im}(S_i^{1/2})$. In this regard, first notice that since the $S_i$ are positive (and
self-adjoint), they have symmetric square roots $S_i^{1/2}$ with associated symmetric
kernel $\Phi_i(s,t)$ given explicitly by

$$\Phi_i(s,t) = \sum_{j=1}^{\infty} \lambda_{ij}^{1/2}\phi_{ij}(s)\phi_{ij}(t), \quad i = 1,2. \tag{18}$$

For $i = 1,2$ the symmetric kernels $\Phi_i(s,t)$ satisfy

$$K_i(s,t) = \int_E \Phi_i(s,r)\Phi_i(t,r)\,d\nu(r). \tag{19}$$

We further note that

$$\mathrm{Im}(S_i^{1/2}) \subset \overline{\mathrm{Im}(S_i^{1/2})} = \ker(S_i^{1/2})^{\perp} = \ker(S_i)^{\perp} = \mathcal{H}_i.$$

Nashed and Wahba [27] then arrive at the following important theorem.

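Theorem 2.2. (Nashed and Wahba, 1974) For $i = 1,2$ the RKHS $\mathcal{H}(K_i)$
consists of functions of the form

$$f(\cdot) = \int_E g(s)\Phi_i(\cdot,s)\,d\nu(s)$$

for some $g \in \mathcal{H}_i$. The inner product in $\mathcal{H}(K_i)$ is

$$\langle f_1, f_2\rangle_{\mathcal{H}(K_i)} = \langle g_1, g_2\rangle_{\mathcal{H}_i} \tag{20}$$

where $g_1, g_2 \in \mathcal{H}_i$ are the minimal $L^2(E)$ norm solutions of

$$f_j(\cdot) = \int_E g_j(s)\Phi_i(\cdot,s)\,d\nu(s), \quad j = 1,2.$$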

Proof: For $i = 1,2$, let $V_i$ be the smallest closed subspace of $\mathcal{H}_i$ that contains
$\Phi_i(\cdot,t)$ for all $t \in E$. Since the smallest linear space containing $\Phi_i(\cdot,t)$ for all
$t \in E$ is $\mathrm{span}\{\Phi_i(\cdot,t) : t \in E\}$, it follows that $V_i = \overline{\mathrm{span}\{\Phi_i(\cdot,t) : t \in E\}}$. Now
the projection theorem ensures that for each $f \in \mathcal{H}(K_i)$ there exists a unique
element $g_f \in V_i$ of minimal norm which is the best-approximate solution to the
inverse problem

$$f(t) = (S_i^{1/2}g_f)(t) = \int_E g_f(s)\Phi_i(s,t)\,d\nu(s), \quad \forall t \in E.$$

Because $g_f$ is unique, the inner product given by (20) and associated norm are
well defined. We now only need to show that $K_i(\cdot,\cdot)$ is the reproducing kernel.
However, by (19),

$$K_i(\cdot,t) = \langle \Phi_i(*,\cdot), \Phi_i(*,t)\rangle_{\mathcal{H}_i}.$$

Thus, by (20),

$$\langle K_i(\cdot,t), f\rangle_{\mathcal{H}(K_i)} = \langle \Phi_i(*,t), g_f(*)\rangle_{\mathcal{H}_i} = (S_i^{1/2}g_f)(t) = f(t). \qquad \square$$

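This theorem shows that for $i = 1,2$ the optimal Hilbert space in which to solve
inverse problems associated with integral equations of the form $f = S_i^{1/2}g$ is
the RKHS setting $\mathcal{H}(K_i)$. To illustrate this, consider the problem of finding a
function $g(\cdot)$ to satisfy $S_i^{1/2}g = \int_E g(s)\Phi_i(\cdot,s)\,d\nu(s) = f(\cdot)$ for some given $f(\cdot) \in
\mathcal{H}_i$. A least-squares solution to this problem is a minimizer of $\|S_i^{1/2}g - f\|_{\mathcal{H}_i}$
and a best least-squares solution is the one with minimum norm. If we let
$F_i = \mathrm{Im}(S_i^{1/2}) \oplus \mathrm{Im}(S_i^{1/2})^{\perp}$ and assume $f \in F_i$, then $g$ is a least-squares solution if
and only if it satisfies the normal equation $S_i g = S_i^{1/2}f$. Furthermore, the unique
best least-squares solution is given by $g = S_i^{\dagger}S_i^{1/2}f = (S_i^{1/2})^{\dagger}f$, with $(S_i^{1/2})^{\dagger}$
denoting the Moore-Penrose inverse of $S_i^{1/2}$, and no least-squares solution exists
if $f \notin F_i$. However, from Engl et al. [14], $f \in F_i$ if and only if $f$ satisfies the
Picard criterion

$$\sum_{j=1}^{\infty} \frac{\langle f, \phi_{ij}\rangle_{\mathcal{H}_i}^{2}}{\lambda_{ij}} < \infty \tag{21}$$

and, in that case,

$$(S_i^{1/2})^{\dagger}f = \sum_{j=1}^{\infty} \lambda_{ij}^{-1/2}\langle f, \phi_{ij}\rangle_{\mathcal{H}_i}\,\phi_{ij}.$$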
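
A small numerical illustration of the Picard criterion (21) follows. It is a
sketch under assumed values: the eigenvalue decay $\lambda_j = j^{-2}$ and the
two coefficient sequences are chosen only for illustration, and the function
name `picard_partial_sums` is hypothetical.

```python
import numpy as np

def picard_partial_sums(coeffs, eigenvalues):
    """Partial sums of sum_j <f, phi_j>^2 / lambda_j."""
    return np.cumsum(coeffs ** 2 / eigenvalues)

j = np.arange(1, 10001, dtype=float)
lam = j ** -2.0                                  # assumed eigenvalue decay

# Coefficients j^{-2}: the Picard terms are j^{-2}, so the series converges.
smooth = picard_partial_sums(j ** -2.0, lam)
# Coefficients j^{-1}: square summable, but every Picard term equals 1.
rough = picard_partial_sums(j ** -1.0, lam)
print(smooth[-1], rough[-1])
```

The second function belongs to the underlying $L^2$ space yet fails the
criterion, since its partial sums grow linearly with the number of terms.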
Note that $f \in \mathcal{H}(K_i)$ if and only if $\sum_{j=1}^{\infty} \langle f, \phi_{ij}\rangle_{\mathcal{H}_i}^{2}/\lambda_{ij} < \infty$. Consequently,

$\mathcal{H}(K_i) = \overline{\mathrm{Im}(S_i^{1/2})} = \ker(S_i)^{\perp}$, under the inner product

with 

k=i 

for z, j = 1,2. 

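For further developments, a congruence which connects $\mathcal{H}_i$ to $\mathcal{H}(K_i)$ must
be established.

Corollary 2.1. (Eubank and Hsing, 2008) For $i = 1,2$ the Hilbert spaces $\overline{\mathrm{Im}(S_i^{1/2})} =
\ker(S_i)^{\perp}$ and $\mathcal{H}(K_i)$ are congruent under the mapping $\Gamma_i : \mathcal{H}_i \to \mathcal{H}(K_i)$ defined
by

$$(\Gamma_i g)(\cdot) = \sum_{j=1}^{\infty} \lambda_{ij}^{1/2}g_{ij}\phi_{ij}(\cdot) \tag{22}$$

where $g = \sum_{j=1}^{\infty} \langle g, \phi_{ij}\rangle_{\mathcal{H}_i}\phi_{ij} = \sum_{j=1}^{\infty} g_{ij}\phi_{ij} \in \ker(S_i)^{\perp}$. The inverse mapping

$$(\Gamma_i^{-1}f)(\cdot) = \sum_{j=1}^{\infty} \lambda_{ij}^{1/2}f_{ij}\phi_{ij}(\cdot) \tag{23}$$

for $f = \sum_{j=1}^{\infty} \lambda_{ij}f_{ij}\phi_{ij}(\cdot) \in \mathcal{H}(K_i)$, is also the adjoint of $\Gamma_i$.

Note that for $i = 1,2$ the operators $\Gamma_i$ and $S_i^{1/2}$ are equal in the sense that
for any $f \in \mathcal{H}_i$, $\Gamma_i f = \sum_{j=1}^{\infty} \lambda_{ij}^{1/2}f_{ij}\phi_{ij} = S_i^{1/2}f$. The difference is in terms of
the norm and inner product for the range of each operator.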


3. Canonical Correlation 

The literature on functional canonical correlation can be roughly dichotomized
into formulations involving Hilbert space valued processes in $\mathcal{H}_i = \ker(S_i)^{\perp}$ (see
He et al. [17] [18]) and an alternative approach that relies on reproducing kernel
Hilbert space (RKHS) theory (Eubank and Hsing [15]). In this section we will
compare and contrast these two different approaches to functional CCA. In the
He et al. [17] approach the squared canonical correlation $\rho_k^2$ and associated
weight functions $f_k$ and $g_k$ are found by the singular value decomposition of the
cross-correlation operator of $X_1$ and $X_2$ defined by

$$R = (S_1^{1/2})^{\dagger}S_{12}(S_2^{1/2})^{\dagger} \tag{24}$$

where $(S_i^{1/2})^{\dagger}$ denotes the Moore-Penrose generalized inverse of $S_i^{1/2}$ for $i = 1,2$
and is given explicitly by

$$(S_i^{1/2})^{\dagger} = \sum_{h=1}^{\infty} \lambda_{ih}^{-1/2}P_{ih}. \tag{25}$$

The right hand and left hand eigenvectors are then found by eigenvalue and
eigenvector analysis of the operators $RR^{*}$ and $R^{*}R$. The basic problem is that
unlike the usual situation in the finite-dimensional case, the square roots of
covariance operators of infinite dimensional Hilbert space valued processes are
not invertible. To resolve this issue, He et al. [17] restrict the domain of
$\{\mathcal{H}_1, \mathcal{H}_2\}$ to the subspace where the Moore-Penrose inverses of $S_1^{1/2}$ and $S_2^{1/2}$
can be defined. Thus, for $i = 1,2$, the domain of $(S_i^{1/2})^{\dagger}$ is restricted to $F_i =
\{S_i^{1/2}h : h \in \ker(S_i)^{\perp}\}$ and is characterized as the set of functions satisfying
the Picard criterion (21) (see Engl et al. [14]). Now, subject to the restriction
that the domain of $R$ be $F_2$, let $\rho_1^2 \ge \rho_2^2 \ge \cdots \ge 0$ denote the eigenvalues
of $R^{*}R$ with $g_1, g_2, \ldots \in F_2$ the corresponding eigenvectors. The left hand
eigenvectors are obtained by $f_k = Rg_k/\rho_k \in F_1$. The canonical correlations and
weight functions are $\{\rho_k, u_k = (S_1^{1/2})^{\dagger}f_k, v_k = (S_2^{1/2})^{\dagger}g_k\}_{k=1}^{\infty}$ and the corresponding
canonical variables are $\{U_k = \langle u_k, X_1\rangle_{\mathcal{H}_1}, V_k = \langle v_k, X_2\rangle_{\mathcal{H}_2}\}_{k=1}^{\infty}$. In the He et
al. [17] method the weight functions are not well defined whenever $f_k \notin F_1$ or
$g_k \notin F_2$.
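
In finite dimensions the construction based on (24) reduces to classical CCA,
and every step can be carried out with ordinary matrices. The sketch below is
an illustration only: the joint covariance matrix is simulated, and the helper
`inv_sqrt` is a hypothetical name for the inverse symmetric square root, which
exists here because the simulated covariance blocks are positive definite.

```python
import numpy as np

def inv_sqrt(M):
    """Inverse symmetric square root of a positive definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V / np.sqrt(w)) @ V.T

rng = np.random.default_rng(4)
p, q = 4, 3
A = rng.standard_normal((p + q, 2 * (p + q)))
S = A @ A.T / A.shape[1]            # simulated joint covariance of (X_1, X_2)
S1, S2, S12 = S[:p, :p], S[p:, p:], S[:p, p:]

R = inv_sqrt(S1) @ S12 @ inv_sqrt(S2)       # finite-dimensional analogue of (24)
F, rho, Gt = np.linalg.svd(R, full_matrices=False)

u = inv_sqrt(S1) @ F[:, 0]          # leading weight vector for X_1
v = inv_sqrt(S2) @ Gt[0]            # leading weight vector for X_2
print(rho, u @ S1 @ u, v @ S2 @ v, u @ S12 @ v)
```

The leading canonical variables have unit variance and their covariance equals
the largest singular value of `R`. In infinite dimensions the inverse square
roots are unbounded, which is where the restriction to $F_1$ and $F_2$ enters.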

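In contrast to the He et al. [17] method, the approach of Eubank and
Hsing [15] involves the singular value decomposition of the RKHS based operator
$T : \mathcal{H}(K_2) \to \mathcal{H}(K_1)$ defined such that for any $g \in \mathcal{H}(K_2)$

$$(Tg)(s) = \langle K_{12}(s,\cdot), g\rangle_{\mathcal{H}(K_2)}. \tag{26}$$

Let $\{\rho_k^2, g_k\}_{k=1}^{\infty}$ denote the eigenvalues and eigenvectors of $T^{*}T$ and $\{\rho_k^2, f_k\}_{k=1}^{\infty}$
denote the eigenvalues and eigenvectors for $TT^{*}$. Then

$$T = \sum_{k=1}^{\infty} \rho_k\, g_k \otimes_{\mathcal{H}(K_2)} f_k. \tag{27}$$

The canonical correlation is $\rho_k$ and $\{f_k, g_k\}_{k=1}^{\infty}$ are the canonical weight
vectors in $\{\mathcal{H}(K_1), \mathcal{H}(K_2)\}$. These canonical weight vectors correspond to the
canonical variables $\{\Psi_1(f_k), \Psi_2(g_k)\}_{k=1}^{\infty}$ that represent the maximally correlated
elements of $\{L^2_{X_1}, L^2_{X_2}\}$.

The relationship between $T$ and $R$ was established by Eubank and Hsing
[15] and can be simply derived by substituting the expression for $K_{12}$ given by
(8) into (26). It follows that for any $g \in \mathcal{H}(K_2)$,

$$(Tg)(s) = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} \gamma_{jk}\langle\phi_{2k}, g\rangle_{\mathcal{H}(K_2)}\phi_{1j}(s) = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} \rho_{jk}\sqrt{\lambda_{1j}\lambda_{2k}}\,\langle\phi_{2k}, g\rangle_{\mathcal{H}(K_2)}\phi_{1j}(s) \tag{28}$$

with $\rho_{jk} = \gamma_{jk}/\sqrt{\lambda_{1j}\lambda_{2k}}$. Now, since $\{\phi_{ik}\}_{k=1}^{\infty}$ are CONSs for $\ker(S_i)^{\perp}$, it follows
that $\{\tilde\phi_{ik} = \Gamma_i\phi_{ik} = \sqrt{\lambda_{ik}}\,\phi_{ik}\}_{k=1}^{\infty}$ are CONSs for $\mathcal{H}(K_i)$. As a result, the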
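operator $T$ may be written as

$$\begin{aligned}
T &= \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} \rho_{jk}\,\tilde\phi_{2k} \otimes_{\mathcal{H}(K_2)} \tilde\phi_{1j} \\
&= \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} \rho_{jk}\,[(\Gamma_2\phi_{2k}) \otimes_{\mathcal{H}(K_2)} (\Gamma_1\phi_{1j})] \\
&= \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} \rho_{jk}\,\Gamma_1[\phi_{2k} \otimes_{\mathcal{H}_2} \phi_{1j}]\Gamma_2^{-1} \\
&= \Gamma_1\left(\sum_{j=1}^{\infty}\sum_{k=1}^{\infty} \rho_{jk}\,[\phi_{2k} \otimes_{\mathcal{H}_2} \phi_{1j}]\right)\Gamma_2^{-1} \\
&= \Gamma_1 R\,\Gamma_2^{-1} = \Gamma_1 (S_1^{1/2})^{\dagger}S_{12}(S_2^{1/2})^{\dagger}\Gamma_2^{-1}.
\end{aligned}$$

Because $\Gamma_2$ is a bijection, $\Gamma_1^{-1}T\Gamma_2$ has the form

$$\Gamma_1^{-1}T\Gamma_2 = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} \rho_{jk}\,(\phi_{2k} \otimes_{\mathcal{H}_2} \phi_{1j}) \tag{29}$$

and the domain is $\ker(S_2)^{\perp}$. By contrast, if we utilize the He et al. [17] method
and restrict ourselves to the domain $F_2 = \mathrm{Im}(S_2^{1/2}) \subset \overline{\mathrm{Im}(S_2^{1/2})} = \ker(S_2)^{\perp}$
then, on this restricted subspace of $\ker(S_2)^{\perp}$,

$$R = (S_1^{1/2})^{\dagger}S_{12}(S_2^{1/2})^{\dagger} = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} \rho_{jk}\,(\phi_{2k} \otimes_{\mathcal{H}_2} \phi_{1j}) = \Gamma_1^{-1}T\Gamma_2\big|_{F_2}.$$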
Since $\Gamma_1$ and $\Gamma_2$ are unitary, $T$ is unitarily equivalent to $R$ and the two methods
agree when both methods are well defined. The differences between the
approaches can be briefly summarized by the fact that in the He et al. [17]
approach the domain of $R$ must be restricted to $F_2 = \mathrm{Im}(S_2^{1/2})$, which is
a dense proper subset of $\ker(S_2)^{\perp}$ in the infinite dimensional case. By contrast,
the domain of $\Gamma_1^{-1}T\Gamma_2$ in the Eubank and Hsing [15] approach is all of $\ker(S_2)^{\perp}$,
since the mapping $\Gamma_2$ is a unitary bijective mapping from $\ker(S_2)^{\perp}$ to $\mathcal{H}(K_2)$.
Therefore the Eubank and Hsing [15] approach is the more comprehensive definition
while the He et al. [17] approach can have non-attainable solutions on the
boundary $\overline{\mathrm{Im}(S_2^{1/2})} \setminus \mathrm{Im}(S_2^{1/2})$. This reveals the advantage of the RKHS based
formulation and we will therefore consider asymptotics associated with the
regularized approximations to $TT^{*}$ and $T^{*}T$ rather than $RR^{*}$ and $R^{*}R$ in this
paper.


4. Regularization 

The need to employ some form of regularization in the functional data analysis
setting is well established on both theoretical and computational
grounds by many authors. For example, it was perhaps Leurgans et al. [25]
who first observed that the sample covariance operator of a stochastic process
has a finite dimensional kernel (Riesz & Sz.-Nagy [32]), while acting on an
infinite dimensional space. Cupidon et al. [6] then showed how most of the
deficiencies of the population canonical correlation can be remedied if a regularized
approximation to the inverses of the covariance operators is involved.

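If $B \in \mathcal{K}(\mathcal{H}_1, \mathcal{H}_2)$ is arbitrary and we are given $g \in \mathcal{H}_2$, it often happens
that we are asked to solve the equation $Bf = g$. If $\{\beta_n, \phi_n, \psi_n\}$ is the
singular system for $B$ so that

$$B = \sum_{n=1}^{\mathrm{rank}(B^{*}B)} \beta_n\,\phi_n \otimes_{\mathcal{H}_1} \psi_n$$

and $g \in \mathrm{Im}(B) \oplus \mathrm{Im}(B)^{\perp}$, then it is well known that a unique best approximate
(least squares) solution $f^{*}$ exists and is given by

$$f^{*} = B^{\dagger}g = \sum_{n=1}^{\mathrm{rank}(B^{*}B)} \frac{\langle g, \psi_n\rangle_{\mathcal{H}_2}}{\beta_n}\,\phi_n.$$

For a compact operator $B$, $Bf = g$ is often ill-posed (see e.g., Theorem 2.14 of
Vogel [34]) and attempts to directly use $B^{\dagger}$ will result in numerically unstable
algorithms. The standard approach to dealing with this problem is to replace
$B^{\dagger}$ with a family of so called regularization operators $\mathcal{R}(\alpha) : \mathcal{H}_2 \to \mathcal{H}_1$ that are
indexed by a regularization parameter, $\alpha \in (0, \bar\alpha) \subset \mathbb{R}$, with $\bar\alpha > 0$. The family
$\{\mathcal{R}(\alpha) : \alpha \in (0, \bar\alpha)\}$ approximates $B^{\dagger}$ in the sense of the following definition (see
Vogel [34] p. 22-23).

Definition The family $\{\mathcal{R}(\alpha) : \alpha \in (0, \bar\alpha)\}$ is a regularization scheme which
converges to $B^{\dagger}$ if

(i) for each $\alpha \in (0, \bar\alpha)$, $\mathcal{R}(\alpha)$ is a continuous operator and

(ii) given any $g \in \mathrm{Im}(B)$, for any sequence $\{g_n\} \subset \mathcal{H}_2$ which converges to $g$,
one can pick a sequence $\{\alpha_n\} \subset (0, \bar\alpha)$ such that

$$[\mathcal{R}(\alpha_n)](g_n) \to B^{\dagger}g \quad \text{as } n \to \infty.$$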


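Of particular interest are linear regularization schemes which have singular value
representations as

$$\mathcal{R}(\alpha) = \sum_{n=1}^{\mathrm{rank}(B^{*}B)} \frac{w_\alpha(\beta_n^2)}{\beta_n}\,\psi_n \otimes_{\mathcal{H}_2} \phi_n$$

where $w_\alpha(\beta_n^2)$ is a real valued function of the squared singular values and $\alpha$
is such that $w_\alpha(\beta_n^2) \to 1$ as $\alpha \to 0$. The function $w_\alpha(\beta_n^2)$ is called the filter
function (see Engl et al. [14]). Two of the most popular examples for filters are

$$w_\alpha(\beta_n^2) = \frac{\beta_n^2}{\beta_n^2 + \alpha}, \quad \text{for } \alpha \in (0, \infty) \text{ and } n = 1,\ldots,\mathrm{rank}(B^{*}B) \tag{30}$$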
and

$$w_\alpha(\beta_n^2) = \begin{cases} 1 & \text{if } \beta_n^2 > \alpha \\ 0 & \text{if } \beta_n^2 \le \alpha \end{cases} \quad \text{for } \alpha \in (0, \|B\|] \text{ and } n = 1,\ldots,\mathrm{rank}(B^{*}B). \tag{31}$$

Equation (30) is referred to as the Tikhonov filter function and (31) is referred
to as the truncated singular value decomposition (TSVD) filter function. In the
case of TSVD regularization, the parameter $\alpha$ in (31) determines the cut-off or
threshold level for the TSVD regularization and produces

$$\mathcal{R}(\alpha) = \sum_{n:\,\beta_n^2 > \alpha} \beta_n^{-1}\,\psi_n \otimes_{\mathcal{H}_2} \phi_n$$

which is a finite rank operator whenever $\alpha > 0$. This paper will focus on
asymptotics associated with Tikhonov and TSVD regularization schemes. In
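developments which follow a Hilbert-Schmidt operator $B : \mathcal{H}_1 \to \mathcal{H}_2$ will often
be expressed in the form

$$B = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} c_{jk}\,\phi_j \otimes_{\mathcal{H}_1} \theta_k \tag{32}$$

with $\{\phi_j\}$ and $\{\theta_k\}$ CONSs for $\mathcal{H}_1$ and $\mathcal{H}_2$, respectively. As $B \in \mathcal{K}_{HS}(\mathcal{H}_1, \mathcal{H}_2)$ the
coefficients $c_{jk} \in \mathbb{R}$ will satisfy

$$\|B\|^2_{HS} = \sum_{j=1}^{\infty} \|B\phi_j\|^2_{\mathcal{H}_2} = \sum_{j=1}^{\infty} \Big\|\sum_{k=1}^{\infty} c_{jk}\theta_k\Big\|^2_{\mathcal{H}_2} = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} c_{jk}^2 < \infty.$$

The regularized operator $B(\alpha) : \mathcal{H}_1 \to \mathcal{H}_2$ will also be a Hilbert-Schmidt operator
and have the form

$$B(\alpha) = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} c_{jk}(\alpha)\,\phi_j \otimes_{\mathcal{H}_1} \theta_k \tag{33}$$

with $\|B(\alpha)\|^2_{HS} = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} c_{jk}^2(\alpha) < \infty$ and the property $c_{jk}(\alpha) \to c_{jk}$ as $\alpha \downarrow 0$.

As a result, the following theorem holds.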

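Theorem 4.1. Suppose $B, B(\alpha) \in \mathcal{K}_{HS}(\mathcal{H}_1, \mathcal{H}_2)$ are of the forms (32) and
(33), respectively. If $c_{jk}(\alpha) \to c_{jk}$ as $\alpha \downarrow 0$, then $\|B(\alpha) - B\| \to 0$.

Proof: Let $A(\alpha) = B(\alpha) - B$ and $a_{jk}(\alpha) = (c_{jk}(\alpha) - c_{jk})$ so that

$$A(\alpha) = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} a_{jk}(\alpha)\,\phi_j \otimes_{\mathcal{H}_1} \theta_k.$$

Since $A(\alpha)$ is Hilbert-Schmidt, $\|A(\alpha)\|^2_{HS} = \sum_{j,k=1}^{\infty} a_{jk}^2(\alpha) < \infty$ for all permissible
values of the regularization parameter $\alpha$. Consequently,

$$\lim_{\alpha \to 0} \|A(\alpha)\|^2_{HS} = \lim_{\alpha \to 0} \sum_{j,k=1}^{\infty} a_{jk}^2(\alpha) = \sum_{j,k=1}^{\infty} \lim_{\alpha \to 0} a_{jk}^2(\alpha) = 0. \tag{34}$$

The exchange in the order of limits and the sum in (34) is permissible by the
Lebesgue dominated convergence theorem since the summands satisfy $a_{jk}^2(\alpha) \le
2(c_{jk}^2(\alpha) + c_{jk}^2)$ and $\sum_{j,k=1}^{\infty} 2(c_{jk}^2(\alpha) + c_{jk}^2) < \infty$. Since $\|B(\alpha) - B\|^2 \le
\|A(\alpha)\|^2_{HS} \to 0$ as $\alpha \downarrow 0$, it follows that $B(\alpha)$ converges to $B$ in operator
norm. $\square$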
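
The two filter functions of this section can be sketched for a matrix in place
of a compact operator. This is a minimal illustration under stated assumptions:
the function names are hypothetical, the TSVD filter uses the convention that
squared singular values above $\alpha$ are retained (the exact threshold
convention of (31) is an assumption here), and the test matrix is random.

```python
import numpy as np

def tikhonov_filter(beta_sq, alpha):
    """Tikhonov filter: beta^2 / (beta^2 + alpha)."""
    return beta_sq / (beta_sq + alpha)

def tsvd_filter(beta_sq, alpha):
    """TSVD filter (assumed convention): keep components with beta^2 > alpha."""
    return (beta_sq > alpha).astype(float)

def regularized_solve(B, g, alpha, filt):
    """Apply sum_n filt(beta_n^2)/beta_n <g, psi_n> phi_n to g."""
    U, beta, Vt = np.linalg.svd(B, full_matrices=False)
    keep = beta > 1e-12 * beta.max()          # drop numerically zero values
    U, beta, Vt = U[:, keep], beta[keep], Vt[keep]
    coef = filt(beta ** 2, alpha) / beta * (U.T @ g)
    return Vt.T @ coef

rng = np.random.default_rng(2)
B = rng.standard_normal((8, 5))
g = rng.standard_normal(8)
alpha = 0.1

x_tik = regularized_solve(B, g, alpha, tikhonov_filter)
x_ridge = np.linalg.solve(B.T @ B + alpha * np.eye(5), B.T @ g)
x_tsvd = regularized_solve(B, g, 1e-12, tsvd_filter)
x_pinv = np.linalg.pinv(B) @ g
print(np.abs(x_tik - x_ridge).max(), np.abs(x_tsvd - x_pinv).max())
```

With the Tikhonov filter the result coincides with the familiar ridge-type
solution $(B^{*}B + \alpha I)^{-1}B^{*}g$, and with a negligible cut-off the TSVD
solution coincides with the Moore-Penrose solution.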

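5. Tikhonov Regularized Canonical Correlation

In the Tikhonov regularized approach to canonical correlation we replace the
operators $\{(S_1^{1/2})^{\dagger}, (S_2^{1/2})^{\dagger}\}$ with $\{(S_1 + \alpha I)^{-1/2}, (S_2 + \alpha I)^{-1/2}\}$ and then let

$$R(\alpha) = (S_1 + \alpha I)^{-1/2}S_{12}(S_2 + \alpha I)^{-1/2} \tag{35}$$

approximate the cross-correlation operator $R$ for $\alpha \in (0, \bar\alpha)$. Since the operators
$\{(S_1 + \alpha I)^{-1/2}, (S_2 + \alpha I)^{-1/2}\}$ are bounded and $S_{12}$ is Hilbert-Schmidt, it follows
that $R(\alpha)$ is Hilbert-Schmidt. Now if we define $T(\alpha) : \mathcal{H}(K_2) \to \mathcal{H}(K_1)$ by

$$T(\alpha) = \Gamma_1 R(\alpha)\Gamma_2^{-1} = \Gamma_1(S_1 + \alpha I)^{-1/2}S_{12}(S_2 + \alpha I)^{-1/2}\Gamma_2^{-1} \tag{36}$$

then as the regularization parameter $\alpha \downarrow 0$,

$$\begin{aligned}
R(\alpha) &= (S_1 + \alpha I)^{-1/2}S_{12}(S_2 + \alpha I)^{-1/2} \\
&= \sum_{j=1}^{\mathrm{rank}(S_1)}\sum_{k=1}^{\mathrm{rank}(S_2)} \frac{\gamma_{jk}}{\sqrt{(\lambda_{1j} + \alpha)(\lambda_{2k} + \alpha)}}\,\phi_{2k} \otimes_{\mathcal{H}_2} \phi_{1j} \\
&\to \sum_{j=1}^{\mathrm{rank}(S_1)}\sum_{k=1}^{\mathrm{rank}(S_2)} \frac{\gamma_{jk}}{\sqrt{\lambda_{1j}\lambda_{2k}}}\,\phi_{2k} \otimes_{\mathcal{H}_2} \phi_{1j} \\
&= \Gamma_1^{-1}T\Gamma_2.
\end{aligned} \tag{37}$$

Therefore, by Theorem 4.1, $R(\alpha)$ converges in operator norm to the operator
$\Gamma_1^{-1}T\Gamma_2$ as $\alpha \downarrow 0$ and the continuity of $\Gamma_1$ and $\Gamma_2^{-1}$ ensures that

$$T(\alpha) = \Gamma_1 R(\alpha)\Gamma_2^{-1} \to T$$

as $\alpha \downarrow 0$, with convergence in terms of operator norm.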
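
The coefficientwise convergence in (37) and the resulting norm convergence can
be illustrated in a truncated eigenbasis. The sketch below works under assumed
values: the eigenvalue sequences and the diagonal coefficient array $\rho_{jk}$
are chosen purely for illustration, and operators are represented by their
coefficient arrays so that the Hilbert-Schmidt norm is the Frobenius norm.

```python
import numpy as np

J = 30
j = np.arange(1, J + 1, dtype=float)
lam1 = j ** -2.0                         # assumed eigenvalues of S_1
lam2 = j ** -2.0                         # assumed eigenvalues of S_2
rho = np.diag(0.9 * 0.8 ** (j - 1))      # assumed coefficients rho_jk
gamma = rho * np.sqrt(np.outer(lam1, lam2))   # gamma_jk = rho_jk sqrt(lam1_j lam2_k)

def R_coeffs(alpha):
    """Coefficients gamma_jk / sqrt((lam1_j + alpha)(lam2_k + alpha))."""
    return gamma / np.sqrt(np.outer(lam1 + alpha, lam2 + alpha))

alphas = [1e-1, 1e-2, 1e-4, 1e-6, 1e-8]
hs_errors = [np.linalg.norm(R_coeffs(a) - rho, "fro") for a in alphas]
print(hs_errors)
```

The Hilbert-Schmidt distance between the regularized and limiting coefficient
arrays decreases monotonically toward zero as $\alpha$ decreases, in line with
Theorem 4.1.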

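We will now show that the regularized canonical correlations along with
the regularized canonical variables converge to the canonical correlations and
variables defined from the Eubank and Hsing [15] methodology as $\alpha \downarrow 0$. In this
regard, suppose that $\{\rho_k(\alpha), f_k(\alpha), g_k(\alpha)\}_{k=1}^{\infty}$ is the singular system for $R(\alpha)$
such that

$$R(\alpha) = \sum_{k=1}^{\infty} \rho_k(\alpha)\,[g_k(\alpha) \otimes_{\mathcal{H}_2} f_k(\alpha)].$$

Then,

$$T(\alpha) = \Gamma_1 R(\alpha)\Gamma_2^{-1} = \sum_{k=1}^{\infty} \rho_k(\alpha)\,[(\Gamma_2 g_k(\alpha)) \otimes_{\mathcal{H}(K_2)} (\Gamma_1 f_k(\alpha))] = \sum_{k=1}^{\infty} \rho_k(\alpha)\,\tilde g_k(\alpha) \otimes_{\mathcal{H}(K_2)} \tilde f_k(\alpha)$$

where $\tilde f_k(\alpha) = \Gamma_1 f_k(\alpha) \in \mathcal{H}(K_1)$ and $\tilde g_k(\alpha) = \Gamma_2 g_k(\alpha) \in \mathcal{H}(K_2)$. Now by (16)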
the canonical weight functions may be written as 

OO OO 

fk{ot) = X! ^ijfkj{a)(t>ij and gk{a) = ^ X 2 jgkj{a)(j) 2 j 
i=i i=i 

with 

fkjia) = {fk{a),(l)ij)m and gkj{a) = (gfe(a), 

Utilizing Theorem (2.1) the corresponding regularized canonical variables in 
and are then 


gfe(a) <^'H{K2) fk{a) 


rank(Si) 

Uk{a) = 'ii^{fk{a))= ^ fkj{a){Xi,(j)ij)^^ and 

rank(S2) 

Vfc(a) = d’2igk{a))= ^ gkj{a){X2,4’2j)-H2- 

i=i 

The continuity of the congruence mappings $\Psi_1$ and $\Psi_2$ ensures the convergence
of the regularized canonical variables $U_k(\alpha) = \Psi_1(\tilde f_k(\alpha))$ and $V_k(\alpha) =
\Psi_2(\tilde g_k(\alpha))$ to the true canonical variables, provided that the regularized canonical
weight functions $\tilde f_k(\alpha) \in \mathcal{H}(K_1)$ and $\tilde g_k(\alpha) \in \mathcal{H}(K_2)$ converge to the true
canonical weight functions $f_k$ and $g_k$ as the regularization parameter tends to
zero. Thus, our tasks are to establish convergence of $\rho_k^2(\alpha)$ to $\rho_k^2$ and of the
regularized RKHS functions $\{\tilde f_k(\alpha), \tilde g_k(\alpha)\}$ to $\{f_k, g_k\}$ for all $k \ge 1$. Concerning
the convergence of the eigenvalues $\rho_k^2(\alpha)$ and the corresponding eigenprojection
operators we have the following result.

Theorem 5.1. Let $\{\rho_k^2(\alpha), P_k(\alpha)\}$ and $\{\rho_k^2, P_k\}$ denote the eigenvalues and
corresponding eigenprojection operators for $T(\alpha)T^{*}(\alpha)$ and $TT^{*}$. The following
limits hold as $\alpha \downarrow 0$:

$$0 \le \rho_k^2(\alpha) \uparrow \rho_k^2 \le 1 \quad \text{as } \alpha \downarrow 0 \text{ for all } k \ge 1 \text{ and} \tag{38}$$

$$\|P_k(\alpha) - P_k\| \le \|P_k(\alpha) - P_k\|_{HS} \to 0. \tag{39}$$
Proof: First note that $\rho_k^2(\alpha) \le \rho_k^2 \le 1$ since $\|T(\alpha)\|^2 \le \|T\|^2 \le 1$ (see
Proposition A.3 of Eubank and Hsing [15]). Now to see that (38) holds, fix
$k \ge 1$. Since $\rho_k^2(\alpha)$ and $\rho_k^2$ are the eigenvalues for $T(\alpha)T^{*}(\alpha)$ and $TT^{*}$,

$$(\rho_k^2 - \rho_k^2(\alpha)) = |\rho_k^2 - \rho_k^2(\alpha)| \le \|TT^{*} - T(\alpha)T^{*}(\alpha)\| \downarrow 0$$

as $\alpha \downarrow 0$. In order to show that $P_k(\alpha) \to P_k$ in operator norm, let $\Gamma_{r,k}$ be
a circle centered at $\rho_k^2$ with radius $r$ chosen so that $\Gamma_{r,k}$ encloses $\rho_k^2$ and no
other eigenvalues of $TT^{*}$. Suppose that $\{R(\alpha,z), R(z)\}$ are the resolvents of
$\{T(\alpha)T^{*}(\alpha), TT^{*}\}$, respectively. Since $\|T(\alpha)T^{*}(\alpha) - TT^{*}\| \to 0$ as $\alpha \downarrow 0$,
it follows from Theorem 10.1 in the appendix that there exists $\alpha_0 > 0$ such
that whenever $0 < \alpha < \alpha_0$, $\Gamma_{r,k}$ encloses $\rho_k^2(\alpha)$ and no other eigenvalues of
$T(\alpha)T^{*}(\alpha)$. Furthermore, for any $\epsilon > 0$ we may take $\alpha_0$ to be sufficiently small
to ensure that $\|T(\alpha)T^{*}(\alpha) - TT^{*}\| < \epsilon$. Relation (92) from the appendix then
has the consequence that

$$\|P_k(\alpha) - P_k\|_{HS} \le r \sup_{z \in \Gamma_{r,k}} \left\{\frac{\|T(\alpha)T^{*}(\alpha) - TT^{*}\|_{HS}\,\|R(z)\|^2_{HS}}{1 - \|T(\alpha)T^{*}(\alpha) - TT^{*}\|_{HS}\,\|R(z)\|_{HS}}\right\}.$$

Thus, if $M(r,k) = \sup_{z \in \Gamma_{r,k}} \|R(z)\|_{HS}$ and $\epsilon > 0$ are chosen so that $\epsilon < \frac{1}{2M(r,k)}$, then
$\|P_k(\alpha) - P_k\|_{HS} \le 2rM^2(r,k)\epsilon$ and hence $\|P_k(\alpha) - P_k\| \le \|P_k(\alpha) - P_k\|_{HS} \to 0$
as $\alpha \downarrow 0$. $\square$

It remains to show that for $k \ge 1$, $\tilde f_k(\alpha) = \Gamma_1(f_k(\alpha))$ and $\tilde g_k(\alpha) = \Gamma_2(g_k(\alpha))$
approach $f_k \in \mathcal{H}(K_1)$ and $g_k \in \mathcal{H}(K_2)$ from the singular system $\{\rho_k, f_k, g_k\}$ of
$T$. We note, however, that eigenvectors associated with any operator are not
defined uniquely. For example, if $\theta$ is an eigenvector for an arbitrary self-adjoint
operator $A$, then $-\theta$ is also an eigenvector. In order to properly establish what
we mean by convergence assume, WLOG, that for all $\alpha > 0$, $\tilde f_k(\alpha)$ is chosen
so that $\langle \tilde f_k(\alpha), f_k\rangle_{\mathcal{H}(K_1)} \ge 0$, with a similar convention applied to $\tilde g_k(\alpha)$. The
theorem below concerns convergence in the case that the eigenspaces associated
with $\{f_k, g_k\}$ are 1-dimensional. Subsequently, we will discuss the higher
dimensional case.


Theorem 5.2. Assume that the eigenspaces associated with the eigenvectors $f_k$
and $g_k$ are one dimensional with $k \in \mathbb{Z}$. Then, as $\alpha \downarrow 0$

$$\|\tilde f_k(\alpha) - f_k\|_{\mathcal{H}(K_1)} \to 0 \quad \text{and} \quad \|\tilde g_k(\alpha) - g_k\|_{\mathcal{H}(K_2)} \to 0.$$


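Proof: For fixed $k \in \mathbb{Z}$, it suffices to show that $\|\tilde f_k(\alpha) - f_k\|_{\mathcal{H}(K_1)} \to 0$. Since the
eigenspaces are one-dimensional it follows that $P_k(\alpha) = \tilde f_k(\alpha) \otimes_{\mathcal{H}(K_1)} \tilde f_k(\alpha)$
and $P_k = f_k \otimes_{\mathcal{H}(K_1)} f_k$. Now, notice that

$$\begin{aligned}
\|P_k(\alpha) - P_k\|^2_{HS} &= \langle P_k(\alpha) - P_k, P_k(\alpha) - P_k\rangle_{HS} \\
&= 2 - 2\langle P_k(\alpha), P_k\rangle_{HS} \\
&= 2 - 2\langle \tilde f_k(\alpha) \otimes_{\mathcal{H}(K_1)} \tilde f_k(\alpha), f_k \otimes_{\mathcal{H}(K_1)} f_k\rangle_{HS} \\
&= 2 - 2\langle \tilde f_k(\alpha), f_k\rangle^2_{\mathcal{H}(K_1)} \\
&= 2(1 - \langle \tilde f_k(\alpha), f_k\rangle_{\mathcal{H}(K_1)})(1 + \langle \tilde f_k(\alpha), f_k\rangle_{\mathcal{H}(K_1)}) \\
&= \|\tilde f_k(\alpha) - f_k\|^2_{\mathcal{H}(K_1)}\,(1 + \langle \tilde f_k(\alpha), f_k\rangle_{\mathcal{H}(K_1)}).
\end{aligned} \tag{40}$$

Furthermore, as $\langle \tilde f_k(\alpha), f_k\rangle_{\mathcal{H}(K_1)} \ge 0$ implies $(1 + \langle \tilde f_k(\alpha), f_k\rangle_{\mathcal{H}(K_1)}) \ge 1$, hence

$$\|\tilde f_k(\alpha) - f_k\|^2_{\mathcal{H}(K_1)} \le \|P_k(\alpha) - P_k\|^2_{HS}.$$

Since $\|P_k(\alpha) - P_k\|_{HS} \to 0$ as $\alpha \downarrow 0$, it follows that $\|\tilde f_k(\alpha) - f_k\|_{\mathcal{H}(K_1)} \to 0$. $\square$

It should be noted that if $\langle \tilde f_k(\alpha), f_k\rangle_{\mathcal{H}(K_1)} < 0$ instead, then $(1 - \langle \tilde f_k(\alpha), f_k\rangle_{\mathcal{H}(K_1)}) \ge
1$ and from (40) we would have

$$\begin{aligned}
\|P_k(\alpha) - P_k\|^2_{HS} &= 2(1 + \langle \tilde f_k(\alpha), -f_k\rangle_{\mathcal{H}(K_1)})(1 - \langle \tilde f_k(\alpha), -f_k\rangle_{\mathcal{H}(K_1)}) \\
&= (1 - \langle \tilde f_k(\alpha), f_k\rangle_{\mathcal{H}(K_1)})\,\|\tilde f_k(\alpha) - (-f_k)\|^2_{\mathcal{H}(K_1)}.
\end{aligned}$$

Hence,

$$\|\tilde f_k(\alpha) - (-f_k)\|^2_{\mathcal{H}(K_1)} \le \|P_k(\alpha) - P_k\|^2_{HS} \to 0 \quad \text{as } \alpha \downarrow 0$$

and $\tilde f_k(\alpha)$ would converge to $(-f_k)$ instead.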

When the eigenspaces have dimension larger than 1, it is possible to find
infinitely many eigenspace invariant rotations $\Theta \in B(\mathcal{H}(K_1))$ so that $f'_k = \Theta f_k$
is still an eigenvector of $TT^*$ with eigenvalue $\rho_k^2$, yet $\|f_k(\alpha) - f'_k\| \not\to 0$ as $\alpha \downarrow 0$
(see Kato [22] p. 98-100).

Theorems 5.1 and 5.2 ensure that when $T$ is simple, the singular system
$\{\rho_k(\alpha), f_k(\alpha), g_k(\alpha)\}$ of $T(\alpha)$ converges to the singular system $\{\rho_k, f_k, g_k\}$ of
$T$ as the regularization parameter $\alpha \downarrow 0$. This is a positive development
provided the singular value decomposition of $T(\alpha)$ can be estimated. However, the
singular value decomposition of $T(\alpha)$ entails the eigenvalue-eigenvector
decomposition of, e.g., the operator

\[
\begin{aligned}
T_1(\alpha) &= T(\alpha)T^*(\alpha) = \Gamma_1 R(\alpha) R^*(\alpha) \Gamma_1^{-1} \\
&= \Gamma_1 (S_1 + \alpha I)^{-1/2} S_{12} (S_2 + \alpha I)^{-1} S_{21} (S_1 + \alpha I)^{-1/2} \Gamma_1^{-1}.
\end{aligned}
\tag{41}
\]

Since $\Gamma_1$ is unknown in (41), we might estimate it using

\[ \Gamma_{1n}(m) = \sum_{i=1}^{m} \lambda_{1in}^{1/2} P_{1in} \]

with $\{\lambda_{1in}, P_{1in}\}$ the estimated eigenvalues and corresponding eigenprojection
operators for $S_{1n}$ and $m$ some integer. This raises the question of how to select
$m$ and, for large $m$, $\Gamma_{1n}(m)$ is approximately $S_{1n}^{1/2}$, whose compact nature is
what prompted us to regularize from the beginning.

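Since we are already utilizing Tikhinov regularization, a possible remedy for
our problem is to replace $\Gamma_1$ with $(S_1 + \alpha I)^{1/2}$. This produces the operator

\[ \mathcal{S}_1(\alpha) = S_{12} (S_2 + \alpha I)^{-1} S_{21} (S_1 + \alpha I)^{-1} \tag{42} \]

whose domain is $\ker(S_1)^{\perp}$ rather than $\mathcal{H}(K_1)$. One advantage of $\mathcal{S}_1(\alpha)$ is that
$\mathrm{Im}(\mathcal{S}_1(\alpha)) \subset \mathrm{Im}(S_1) \subset \Gamma_1$ and hence the eigenfunctions of $\mathcal{S}_1(\alpha)$ satisfy the
Picard criteria. To see this, note by the infinite dimensional extension of the result
from Khatri [20] we have that $S_1 S_1^{\dagger} S_{12} = S_{12}$ and hence $\mathcal{S}_1(\alpha) = S_1 S_1^{\dagger} \mathcal{S}_1(\alpha)$
(see King [20]). Note that the operator $\mathcal{S}_1(\alpha)$ is self-adjoint since

\[
\sum_{j=1}^{\infty} \sum_{k=1}^{\infty} \frac{\cdots}{(\lambda_{1j} + \alpha)(\lambda_{2k} + \alpha)} [\phi_{1j} \otimes \phi_{1k}]
= (S_1 + \alpha I)^{-1} S_{12} (S_2 + \alpha I)^{-1} S_{21} = \mathcal{S}_1(\alpha).
\]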


Furthermore, as $S_{12}$ is a factor in $\mathcal{S}_1(\alpha)$, the operator is Hilbert-Schmidt and
hence admits an eigenvalue-eigenvector decomposition

\[ \mathcal{S}_1(\alpha) = \sum_{j=1}^{\infty} \rho_j^2(\alpha) [f_j(\alpha) \otimes f_j(\alpha)]. \]

Now the question becomes: how can an operator whose domain and range are
subsets of $\mathcal{H}_1$ approximate an operator whose domain and range are subsets of
$\mathcal{H}(K_1)$? This question was fundamentally answered by Nashed
and Wahba [27], who proved that the collection of functions in $\mathcal{H}(K_1)$
is the same as $\mathrm{Im}(S_1^{1/2})$, except with alternate norm and inner product. As
the collection of eigenfunctions $\{f_j(\alpha)\}_{j=1}^{\infty}$ reside in $\mathrm{Im}(S_1^{1/2})$, they also have
“dual citizenship” in $\mathcal{H}(K_1)$. We may therefore regard the eigenfunction
sequence $\{f_j(\alpha)\}$ as residing in $\mathcal{H}(K_1)$, provided that we norm the eigenfunctions
correctly. If we treat the eigenfunctions $f_j(\alpha)$ as citizens of $\mathcal{H}(K_1)$, for notational
consistency we will denote them by $\tilde{f}_j(\alpha)$ with $\tilde{f}_j(\alpha) = f_j(\alpha)$. There
are therefore two possible views one may adopt concerning the operator $\mathcal{S}_1(\alpha)$:

(i) In the first view of $\mathcal{S}_1(\alpha)$, we treat the operator as a self-adjoint mapping
in $\ker(S_1)^{\perp} \subset L^2(E_1)$.

(ii) In the second viewpoint, the operator is treated as a self-adjoint mapping
on $\mathcal{H}(K_1)$ with $\mathcal{S}_1(\alpha)$ regarded as “two perturbations” distant from
the operator $TT^*$, which is its ultimate intended target of approximation.

When the second viewpoint for $\mathcal{S}_1(\alpha)$ is adopted, the operator $\mathcal{S}_1(\alpha)$ is
representable by the $\mathcal{H}(K_1)$ based operator

\[ \mathcal{S}_1(\alpha) = \Gamma_1 S_1^{-1/2} S_{12} (S_2 + \alpha I)^{-1} S_{21} (S_1 + \alpha I)^{-1} S_1^{1/2} \Gamma_1^{-1}. \]

We then see that as $\alpha \downarrow 0$,


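\[
\begin{aligned}
\Gamma_1^{-1} \mathcal{S}_1(\alpha) \Gamma_1 &= S_1^{-1/2} S_{12} (S_2 + \alpha I)^{-1} S_{21} (S_1 + \alpha I)^{-1} S_1^{1/2} \\
&= \sum_{j=1}^{\infty} \sum_{k=1}^{\infty} \frac{\cdots}{(\lambda_{1j} + \alpha)(\lambda_{2k} + \alpha)} [\phi_{1j} \otimes \phi_{1k}] \\
&\to \sum_{j=1}^{\infty} \sum_{k=1}^{\infty} \frac{\cdots}{(\lambda_{1j})(\lambda_{2k})} [\phi_{1j} \otimes \phi_{1k}] \\
&= \Gamma_1^{-1} TT^* \Gamma_1.
\end{aligned}
\]

Therefore, since $\mathcal{S}_1(\alpha)$ is Hilbert-Schmidt, Theorem 4.1 ensures that as the
regularization parameter $\alpha \downarrow 0$,

\[ \|\Gamma_1^{-1} \mathcal{S}_1(\alpha) \Gamma_1 - \Gamma_1^{-1} TT^* \Gamma_1\|_{\mathcal{H}_1} = \|\mathcal{S}_1(\alpha) - TT^*\|_{\mathcal{H}(K_1)} \to 0. \]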
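
A finite-dimensional analogue may help fix ideas about the operator in (42). The sketch below is an illustration under stated assumptions, not the functional setting of the paper: the covariance `Sigma` is a hypothetical positive definite matrix, so the $\alpha = 0$ operator exists and its eigenvalues are the squared canonical correlations. The function names are ours.

```python
import numpy as np

def regularized_operator(S1, S12, S2, alpha):
    """Finite-dimensional stand-in for S_12 (S_2 + aI)^{-1} S_21 (S_1 + aI)^{-1}."""
    p, q = S1.shape[0], S2.shape[0]
    left = S12 @ np.linalg.solve(S2 + alpha * np.eye(q), S12.T)   # S12 (S2 + aI)^{-1} S21
    # Right-multiply by (S1 + aI)^{-1}; that matrix is symmetric, so solve on the transpose.
    return np.linalg.solve(S1 + alpha * np.eye(p), left.T).T

def regularized_eigenvalues(S1, S12, S2, alpha):
    M = regularized_operator(S1, S12, S2, alpha)
    return np.sort(np.linalg.eigvals(M).real)[::-1]

def squared_canonical_correlations(S1, S12, S2):
    """Squared singular values of the whitened cross-covariance (the alpha = 0 target)."""
    L1 = np.linalg.cholesky(S1)
    L2 = np.linalg.cholesky(S2)
    K = np.linalg.solve(L1, S12) @ np.linalg.inv(L2).T
    return np.sort(np.linalg.svd(K, compute_uv=False) ** 2)[::-1]

rng = np.random.default_rng(1)
p, q = 4, 3
A = rng.standard_normal((p + q, p + q))
Sigma = A @ A.T + 0.5 * np.eye(p + q)            # hypothetical joint covariance
S1, S12, S2 = Sigma[:p, :p], Sigma[:p, p:], Sigma[p:, p:]

target = np.zeros(p)
target[:q] = squared_canonical_correlations(S1, S12, S2)   # rank of S12 is at most q
for alpha in (1.0, 1e-2, 1e-4, 1e-8):
    gap = np.max(np.abs(regularized_eigenvalues(S1, S12, S2, alpha) - target))
    print(f"alpha={alpha:g}  max eigenvalue gap={gap:.3e}")
```

In this toy setting the gap shrinks monotonically as `alpha` decreases, mirroring the convergence statement above; in infinite dimensions the limit operator is only available through the RKHS construction.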


6. Asymptotic Properties for Tikhinov Regularization

In this section we will consider the asymptotics associated with the sample
estimators of the operators $\mathcal{S}_1(\alpha)$. The asymptotics associated with the operator
$\mathcal{S}_1(\alpha)$ rely heavily on perturbation theory concepts discussed in Dauxois et al.
[9] as well as delta method theory for random operators discussed in Cupidon
et al. [7].

To begin, we suppose that a random sample $X_1, X_2, \ldots, X_n$ of independent,
identically distributed copies of $X \in L^2(E)$ is observed. The sample estimator
associated with the covariance operator of $X$ is given by

\[ S_n = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X}) \otimes_{\mathcal{H}} (X_i - \bar{X}) \tag{43} \]

and the continuous mapping theorem along with the law of large numbers en¬ 
sures that Sin = Si for 1 = 1,2 and Si 2 n = '^iSn '^2 = S^in 'S '12 

as $n \to \infty$. For Tikhinov regularization we will have need of the function
$\varphi_\alpha(z) = (z + \alpha)^{-1}$, which is analytic for all points in the complex plane, except
for a pole at $z = -\alpha$. Consequently, the disk $D = \{z \in \mathbb{C} : \min_{0 \le x \le \|S\|} |z - x| < \cdots\}$
contains the spectra of $S$ and the function $\varphi_\alpha$ is analytic on $D$. It follows
by the continuous mapping theorem that $\varphi_\alpha(S_{in}) \to \varphi_\alpha(S_i)$ as $n \to \infty$ for
$i = 1,2$.

As a consequence of the continuous mapping theorem and the central limit 
theorem for Hilbert space operators (see Dauxois et al. [9]) we have that 

^{T,SnT, - T,ST,) = - S,j) A T,A7T, = M, (44) 

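where, for $i,j = 1,2$, $\mathcal{N}_{ij} \in \mathcal{K}_{HS}(\mathcal{H}_i, \mathcal{H}_j)$ is a Gaussian random operator that
has mean zero and variance

\[ \Sigma_{ij} = E\{(X_i \otimes X_j - S_{ij}) \otimes_{HS} (X_i \otimes X_j - S_{ij})\} \tag{45} \]

and $\mathcal{N}_{ii} = \mathcal{N}_i$. Furthermore, by the delta method result from Cupidon et al. [7]
it follows that for $i = 1,2$

\[ \sqrt{n} \{\varphi_\alpha(S_{in}) - \varphi_\alpha(S_i)\} \xrightarrow{d} \varphi'_\alpha(\mathcal{N}_i) \tag{46} \]

where the limit in $\mathcal{K}_{HS}(\mathcal{H}_i)$ has zero mean and is distributed as

\[
\varphi'_\alpha(\mathcal{N}_i) = -\sum_{k=1}^{\infty} (\lambda_{ik} + \alpha)^{-2} P_{ik} \mathcal{N}_i P_{ik}
- \sum_{j \ne k} \frac{1}{(\lambda_{ik} + \alpha)(\lambda_{ij} + \alpha)} P_{ij} \mathcal{N}_i P_{ik}
\tag{47}
\]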


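with $\{\lambda_{ik}, P_{ik}\}_{k=1}^{\infty}$ the eigenvalues and eigenprojection operators corresponding
to $S_i$, $i = 1,2$ (see Appendix).

The sample version of the operator $\mathcal{S}_1(\alpha)$ is then defined by

\[ \mathcal{S}_{1n}(\alpha) = S_{12n} (S_{2n} + \alpha I)^{-1} S_{21n} (S_{1n} + \alpha I)^{-1}. \tag{48} \]

The asymptotic analysis of $\sqrt{n}(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))$ follows from a product rule
application of the delta method similar to that in Cupidon et al. [7]. In this
regard we introduce the following Gaussian elements in the set $\mathcal{K}_{HS}(\mathcal{H}(K_1))$ of
Hilbert-Schmidt operators on $\mathcal{H}(K_1)$:

\[
\begin{aligned}
\mathcal{G}_{11}(\alpha) &= \mathcal{N}_{12}\, \varphi_\alpha(S_2)\, S_{21}\, \varphi_\alpha(S_1), \\
\mathcal{G}_{12}(\alpha) &= S_{12}\, \varphi'_\alpha(\mathcal{N}_2)\, S_{21}\, \varphi_\alpha(S_1), \\
\mathcal{G}_{13}(\alpha) &= S_{12}\, \varphi_\alpha(S_2)\, \mathcal{N}_{21}\, \varphi_\alpha(S_1), \\
\mathcal{G}_{14}(\alpha) &= S_{12}\, \varphi_\alpha(S_2)\, S_{21}\, \varphi'_\alpha(\mathcal{N}_1), \\
\mathcal{G}_1(\alpha) &= \sum_{k=1}^{4} \mathcal{G}_{1k}(\alpha).
\end{aligned}
\tag{49}
\]

Corollary 6.1. If $E\|X\|^4_{L^2(E)} < \infty$, then as $n \to \infty$

\[ \sqrt{n}(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha)) \xrightarrow{d} \mathcal{G}_1(\alpha). \tag{50} \]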

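Proof: Define

\[
\begin{aligned}
A_{11}(\alpha) &= [S_{12n} - S_{12}]\, \varphi_\alpha(S_{2n})\, S_{21n}\, \varphi_\alpha(S_{1n}), \\
A_{12}(\alpha) &= S_{12}\, [\varphi_\alpha(S_{2n}) - \varphi_\alpha(S_2)]\, S_{21n}\, \varphi_\alpha(S_{1n}), \\
A_{13}(\alpha) &= S_{12}\, \varphi_\alpha(S_2)\, [S_{21n} - S_{21}]\, \varphi_\alpha(S_{1n}), \\
A_{14}(\alpha) &= S_{12}\, \varphi_\alpha(S_2)\, S_{21}\, [\varphi_\alpha(S_{1n}) - \varphi_\alpha(S_1)].
\end{aligned}
\]

Notice that the difference $\sqrt{n}(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))$ can be expanded so that

\[ \sqrt{n}(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha)) = \sqrt{n} \sum_{i=1}^{4} A_{1i}(\alpha). \]

The application of (44), (46) and Slutsky’s Theorem then ensure that

\[ \sqrt{n}(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha)) = \sqrt{n} \sum_{i=1}^{4} A_{1i}(\alpha) \xrightarrow{d} \mathcal{G}_1(\alpha) \]

since, for example, the term $\sqrt{n} A_{11}(\alpha)$ consists of the factor
$\sqrt{n}[S_{12n} - S_{12}] \xrightarrow{d} \mathcal{N}_{12}$ right-multiplied by the factor

\[ \varphi_\alpha(S_{2n})\, S_{21n}\, \varphi_\alpha(S_{1n}) \xrightarrow{p} \varphi_\alpha(S_2)\, S_{21}\, \varphi_\alpha(S_1). \]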

❖ 

As a result of Corollary 6.1, we see that $\mathcal{S}_{1n}(\alpha)$ is a consistent estimator of
$\mathcal{S}_1(\alpha)$ as

\[ \|\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha)\| = O_p(n^{-1/2}) \to 0. \tag{51} \]

However, note that as long as the regularization parameter $\alpha > 0$,
$\|\mathcal{S}_{1n}(\alpha) - TT^*\| \not\to 0$. In fact, by the triangle inequality we have

\[ \|\mathcal{S}_{1n}(\alpha) - TT^*\| \le \|\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha)\| + \|\mathcal{S}_1(\alpha) - TT^*\|. \tag{52} \]

The first term on the right-hand side of (52) can be viewed as a random error
that originates from using a sample estimator of $\mathcal{S}_1(\alpha)$. This term tends to zero
as $n \to \infty$ by (51). On the other hand, the second term on the right-hand side
of (52) is a deterministic error that arises from the regularized approximation
of $TT^*$. This latter term will only become negligible if $\alpha \downarrow 0$.
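
The two error sources in (52) can be separated in a small simulation. This is a rough sketch under simplifying assumptions: the data are finite-dimensional Gaussian vectors with a hypothetical covariance `Sigma`, and the $\alpha = 0$ matrix operator stands in for $TT^*$. The helper names are illustrative only.

```python
import numpy as np

def tikhonov_operator(S, p, alpha):
    """S12 (S2 + aI)^{-1} S21 (S1 + aI)^{-1} built from the blocks of a joint covariance S."""
    S1, S12, S2 = S[:p, :p], S[:p, p:], S[p:, p:]
    q = S2.shape[0]
    return (S12 @ np.linalg.inv(S2 + alpha * np.eye(q)) @ S12.T
            @ np.linalg.inv(S1 + alpha * np.eye(p)))

def error_terms(Sigma, p, alpha, n, rng):
    """Return (total, sampling, regularization) errors in operator norm."""
    X = rng.multivariate_normal(np.zeros(Sigma.shape[0]), Sigma, size=n)
    Sn = np.cov(X, rowvar=False, bias=True)      # sample covariance with 1/n scaling
    target = tikhonov_operator(Sigma, p, 0.0)    # stand-in for TT* (alpha = 0)
    population = tikhonov_operator(Sigma, p, alpha)
    sample = tikhonov_operator(Sn, p, alpha)
    norm = lambda M: np.linalg.norm(M, 2)
    return norm(sample - target), norm(sample - population), norm(population - target)

rng = np.random.default_rng(3)
p, q = 3, 2
A = rng.standard_normal((p + q, p + q))
Sigma = A @ A.T + 0.5 * np.eye(p + q)            # hypothetical joint covariance
for n in (100, 10_000):
    total, sampling, regularization = error_terms(Sigma, p, 0.05, n, rng)
    print(f"n={n}: total={total:.4f} sampling={sampling:.4f} regularization={regularization:.4f}")
```

The regularization term does not depend on `n`, so increasing the sample size alone cannot drive the total error to zero for a fixed `alpha`.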

Since the limiting distribution for $\sqrt{n}(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))$ has been established,
we may establish the limiting distributions associated with sample estimators
for the regularized canonical correlation and associated projection operator
and weight functions. The quantities of interest are $\sqrt{n}\{\rho_{kn}(\alpha) - \rho_k(\alpha)\}$,
$\sqrt{n}\{P_{1kn}(\alpha) - P_{1k}(\alpha)\}$ and $\sqrt{n}\{f_{kn}(\alpha) - f_k(\alpha)\}$, where $\{\rho_k(\alpha), P_{1k}(\alpha), f_k(\alpha)\}$
denote the eigenvalues, eigenprojections and eigenvectors for $\mathcal{S}_1(\alpha)$ and
$\{\rho_{kn}(\alpha), P_{1kn}(\alpha), f_{kn}(\alpha)\}$ denote the same for $\mathcal{S}_{1n}(\alpha)$.

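Theorem 6.1. Suppose that $E\|X\|^4_{L^2(E)} < \infty$. Then, as $n \to \infty$

\[ \sqrt{n}\{P_{1kn}(\alpha) - P_{1k}(\alpha)\} \xrightarrow{d} P_{1k}(\alpha)\mathcal{G}_1(\alpha)Q_{1k}(\alpha) + Q_{1k}(\alpha)\mathcal{G}_1(\alpha)P_{1k}(\alpha) \tag{53} \]

where $\mathcal{G}_1(\alpha)$ is as in (49) and

\[ Q_{1k}(\alpha) = \sum_{j \ne k} \frac{1}{\rho_j(\alpha) - \rho_k(\alpha)} P_{1j}(\alpha). \]

In the case that $\mathrm{rank}(P_{1k}(\alpha)) = 1$,

\[ \sqrt{n}\{f_{kn}(\alpha) - f_k(\alpha)\} \xrightarrow{d} Q_{1k}(\alpha)\mathcal{G}_1(\alpha)f_k(\alpha). \tag{54} \]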

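Proof: For each $k \in \mathbb{Z}$, let $\Gamma_k$ denote a circle that encloses the eigenvalue $\rho_k(\alpha)$
but no other eigenvalue of $\mathcal{S}_1(\alpha)$. It follows from developments in
the appendix that

\[ \sqrt{n}\{P_{1kn}(\alpha) - P_{1k}(\alpha)\} = \frac{\sqrt{n}}{2\pi i} \oint_{\Gamma_k} R(z)(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))R(z)H_n(z,\alpha)\,dz \tag{55} \]

with $H_n(z,\alpha) = \sum_{i=0}^{\infty} \{(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))R(z)\}^i$ and $R(z)$ the resolvent of
$\mathcal{S}_1(\alpha)$. Now the integrand in (55) can be expanded into

\[ R(z)(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))R(z)H_n(z,\alpha) = R(z)(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))R(z) + M(z,\alpha) \]

with

\[
\begin{aligned}
M(z,\alpha) &= R(z)(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))R(z) \sum_{i=1}^{\infty} \{(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))R(z)\}^i \\
&= R(z)(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))R(z)(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))R(z) \\
&\quad + R(z)(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))R(z)(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))R(z)(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))R(z) + \cdots \\
&= O_p(n^{-1}).
\end{aligned}
\]

It follows that

\[ \sqrt{n}\{P_{1kn}(\alpha) - P_{1k}(\alpha)\} = \frac{\sqrt{n}}{2\pi i} \oint_{\Gamma_k} R(z)(\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha))R(z)\,dz + O_p(n^{-1/2}). \tag{56} \]

We may now focus attention on the lead term in (56). From Corollary 6.1 and
the continuous mapping theorem it follows that

\[ \sqrt{n}\{P_{1kn}(\alpha) - P_{1k}(\alpha)\} \xrightarrow{d} \frac{1}{2\pi i} \oint_{\Gamma_k} R(z)\mathcal{G}_1(\alpha)R(z)\,dz. \tag{57} \]

To simplify the last expression we write

\[ R(z) = \sum_{k=1}^{\infty} \frac{1}{\rho_k(\alpha) - z} P_{1k}(\alpha) + O((\rho_k(\alpha) - z)^{-2}) \]

and all but the lead term will vanish when the contour integral is taken due to
(96). The integrand in (57) can then be simplified as

\[ R(z)\mathcal{G}_1(\alpha)R(z) = \sum_{k} \sum_{j} \frac{1}{(\rho_k(\alpha) - z)(\rho_j(\alpha) - z)} P_{1k}(\alpha)\mathcal{G}_1(\alpha)P_{1j}(\alpha). \tag{58} \]

Applying the Cauchy integral formula to (58) ensures that

\[ \frac{1}{2\pi i} \oint_{\Gamma_\ell} \frac{dz}{(\rho_k(\alpha) - z)(\rho_j(\alpha) - z)} = \frac{1}{2\pi i} \oint_{\Gamma_\ell} \frac{(\rho_\ell(\alpha) - z)\,dz}{(\rho_k(\alpha) - z)(\rho_j(\alpha) - z)(\rho_\ell(\alpha) - z)} \]

and the only case where the integral is non-zero is when exactly one of $\rho_k(\alpha)$ or
$\rho_j(\alpha)$ is not equal to $\rho_\ell(\alpha)$. When, for example, $\rho_k(\alpha) = \rho_\ell(\alpha)$ and $\rho_\ell(\alpha) \ne \rho_j(\alpha)$
we have

\[ \frac{1}{2\pi i} \oint_{\Gamma_\ell} \frac{(\rho_\ell(\alpha) - z)\,dz}{(\rho_k(\alpha) - z)(\rho_j(\alpha) - z)(\rho_\ell(\alpha) - z)} = \frac{1}{\rho_j(\alpha) - \rho_k(\alpha)} \]

and hence

\[
\begin{aligned}
\sqrt{n}\{P_{1kn}(\alpha) - P_{1k}(\alpha)\} &\xrightarrow{d} \sum_{i=1}^{\infty} \sum_{j \ne k} \frac{\delta_{ik}}{\rho_j(\alpha) - \rho_k(\alpha)} P_{1i}(\alpha)\mathcal{G}_1(\alpha)P_{1j}(\alpha) \\
&\quad + \sum_{j=1}^{\infty} \sum_{i \ne k} \frac{\delta_{jk}}{\rho_i(\alpha) - \rho_k(\alpha)} P_{1i}(\alpha)\mathcal{G}_1(\alpha)P_{1j}(\alpha) \\
&= P_{1k}(\alpha)\mathcal{G}_1(\alpha)Q_{1k}(\alpha) + Q_{1k}(\alpha)\mathcal{G}_1(\alpha)P_{1k}(\alpha)
\end{aligned}
\]

which establishes (53).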
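
The contour-integral representation of an eigenprojection that underlies this argument can be checked numerically for a symmetric matrix. The sketch below assumes a counterclockwise circle and the resolvent written as $(zI - A)^{-1}$; these orientation and sign conventions are choices made for the illustration and may differ from those in the appendix. The matrix `A` and the function name are hypothetical.

```python
import numpy as np

def riesz_projection(A, center, radius, n_nodes=400):
    """Approximate (1 / (2*pi*i)) * contour integral of (zI - A)^{-1} dz over a circle."""
    d = A.shape[0]
    theta = 2.0 * np.pi * np.arange(n_nodes) / n_nodes
    P = np.zeros((d, d), dtype=complex)
    for t in theta:
        z = center + radius * np.exp(1j * t)
        dz = 1j * radius * np.exp(1j * t) * (2.0 * np.pi / n_nodes)   # trapezoid weight
        P += np.linalg.inv(z * np.eye(d) - A) * dz
    return (P / (2j * np.pi)).real

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
eigenvalues = np.array([1.0, 0.6, 0.3, 0.1])
A = Q @ np.diag(eigenvalues) @ Q.T               # symmetric matrix with known spectrum
P_contour = riesz_projection(A, center=0.6, radius=0.1)   # circle encloses only 0.6
P_exact = np.outer(Q[:, 1], Q[:, 1])
print(np.max(np.abs(P_contour - P_exact)))
```

Because the circle encloses a single eigenvalue, the integral recovers the corresponding rank-one eigenprojection; enlarging the radius to enclose several eigenvalues would return the sum of their projections.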

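To obtain the limiting distribution of $\sqrt{n}\{f_{kn}(\alpha) - f_k(\alpha)\}$, first observe
that an application of Theorem 10.1 ensures that for large $n$ and probability
tending to 1, $\mathrm{rank}(P_{1kn}(\alpha)) = 1$. Thus, we may write $P_{1kn}(\alpha) = f_{kn}(\alpha) \otimes f_{kn}(\alpha)$
and hence

\[
\begin{aligned}
\langle P_{1kn}(\alpha) - P_{1k}(\alpha), P_{1k}(\alpha) \rangle_{HS} &= \langle P_{1kn}(\alpha), P_{1k}(\alpha) \rangle_{HS} - 1 \\
&= \langle (f_{kn}(\alpha) \otimes f_{kn}(\alpha)), (f_k(\alpha) \otimes f_k(\alpha)) \rangle_{HS} - 1 \\
&= \langle f_{kn}(\alpha) - f_k(\alpha), f_k(\alpha) \rangle_{\mathcal{H}(K_1)} \left( \langle f_{kn}(\alpha), f_k(\alpha) \rangle_{\mathcal{H}(K_1)} + 1 \right).
\end{aligned}
\]

Furthermore, we note that

\[ \sqrt{n}\{f_{kn}(\alpha) - f_k(\alpha)\} = \sqrt{n}\,P_{1k}(\alpha)\{f_{kn}(\alpha) - f_k(\alpha)\} + \sqrt{n}\,[I - P_{1k}(\alpha)]\{f_{kn}(\alpha) - f_k(\alpha)\}. \tag{59} \]

Focussing on the first term on the right-hand side of (59) we see that

\[
\begin{aligned}
\sqrt{n}\,P_{1k}(\alpha)\{f_{kn}(\alpha) - f_k(\alpha)\} &= \sqrt{n}\,\langle f_{kn}(\alpha) - f_k(\alpha), f_k(\alpha) \rangle_{\mathcal{H}(K_1)}\, f_k(\alpha) \\
&= \frac{\sqrt{n}\,\langle P_{1kn}(\alpha) - P_{1k}(\alpha), P_{1k}(\alpha) \rangle_{HS}}{\langle f_{kn}(\alpha), f_k(\alpha) \rangle_{\mathcal{H}(K_1)} + 1} \cdot f_k(\alpha).
\end{aligned}
\tag{60}
\]

Due to the continuity of the inner product and (53) it follows that

\[
\begin{aligned}
\sqrt{n}\,\langle P_{1kn}(\alpha) - P_{1k}(\alpha), P_{1k}(\alpha) \rangle_{HS}
&\xrightarrow{d} \langle P_{1k}(\alpha)\mathcal{G}_1(\alpha)Q_{1k}(\alpha), P_{1k}(\alpha) \rangle_{HS} + \langle Q_{1k}(\alpha)\mathcal{G}_1(\alpha)P_{1k}(\alpha), P_{1k}(\alpha) \rangle_{HS} \\
&= \mathrm{tr}\left( Q_{1k}(\alpha)\mathcal{G}_1(\alpha)P_{1k}(\alpha) \right) + \mathrm{tr}\left( P_{1k}(\alpha)\mathcal{G}_1(\alpha)Q_{1k}(\alpha)P_{1k}(\alpha) \right) \\
&= \mathrm{tr}\left( Q_{1k}(\alpha)\mathcal{G}_1(\alpha)P_{1k}(\alpha) \right) + 0 \\
&= \mathrm{tr}\left( P_{1k}(\alpha)Q_{1k}(\alpha)\mathcal{G}_1(\alpha) \right) \\
&= 0
\end{aligned}
\tag{61}
\]

because

\[ Q_{1k}(\alpha)P_{1k}(\alpha) = \sum_{j \ne k} \frac{1}{\rho_j(\alpha) - \rho_k(\alpha)} P_{1j}(\alpha)P_{1k}(\alpha) = 0 = P_{1k}(\alpha)Q_{1k}(\alpha). \tag{62} \]

Consequently, the numerator in (60) converges in probability to 0 whereas
the denominator $(\langle f_{kn}(\alpha), f_k(\alpha) \rangle_{\mathcal{H}(K_1)} + 1) = (2 + O_p(n^{-1/2}))$. Slutsky’s
theorem then implies that $\sqrt{n}\,P_{1k}(\alpha)\{f_{kn}(\alpha) - f_k(\alpha)\} \xrightarrow{p} 0$ and hence
$\langle f_{kn}(\alpha), f_k(\alpha) \rangle_{\mathcal{H}(K_1)} \xrightarrow{p} 1$.

To address the second term on the right-hand side of (59) we observe that
as a consequence of Slutsky’s theorem

\[
\begin{aligned}
\sqrt{n}\,[I - P_{1k}(\alpha)]\{f_{kn}(\alpha) - f_k(\alpha)\} &= \sqrt{n}\,[I - P_{1k}(\alpha)]\,f_{kn}(\alpha) \\
&= \frac{\sqrt{n}\,[I - P_{1k}(\alpha)]\,P_{1kn}(\alpha)\,f_k(\alpha)}{\langle f_{kn}(\alpha), f_k(\alpha) \rangle_{\mathcal{H}(K_1)}} \\
&= \frac{\sqrt{n}\,[I - P_{1k}(\alpha)]\,\{P_{1kn}(\alpha) - P_{1k}(\alpha)\}\,f_k(\alpha)}{\langle f_{kn}(\alpha), f_k(\alpha) \rangle_{\mathcal{H}(K_1)}} \\
&\xrightarrow{d} [I - P_{1k}(\alpha)]\,\{P_{1k}(\alpha)\mathcal{G}_1(\alpha)Q_{1k}(\alpha) + Q_{1k}(\alpha)\mathcal{G}_1(\alpha)P_{1k}(\alpha)\}\,f_k(\alpha) \\
&= [I - P_{1k}(\alpha)]\,Q_{1k}(\alpha)\mathcal{G}_1(\alpha)\,f_k(\alpha) \\
&= Q_{1k}(\alpha)\mathcal{G}_1(\alpha)f_k(\alpha).
\end{aligned}
\tag{63}
\]

Equation (63) establishes (54), which completes the proof. ❖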

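We may now derive the limiting distribution for $\sqrt{n}[\rho_{kn}(\alpha) - \rho_k(\alpha)]$, where
$\{\rho_{kn}(\alpha), \rho_k(\alpha)\}$ denotes the $k$-th distinct eigenvalue associated with $\{\mathcal{S}_{1n}(\alpha), \mathcal{S}_1(\alpha)\}$.
In the following result, if $\rho_k(\alpha)$ has geometric multiplicity $d_k$ then $\sqrt{n}[\rho_{kn}(\alpha) - \rho_k(\alpha)]$
will be regarded as a vector of dimension $d_k$.

Theorem 6.2. Assume that $E\|X\|^4_{L^2(E)} < \infty$ and the regularized canonical
correlation, $\rho_k(\alpha)$, has geometric multiplicity $d_k$. Then,

\[
\sqrt{n}[\rho_{kn}(\alpha) - \rho_k(\alpha)] = \sqrt{n}\{P_{1kn}(\alpha)\mathcal{S}_{1n}(\alpha)P_{1kn}(\alpha) - \rho_k(\alpha)P_{1k}(\alpha)\}
\xrightarrow{d} P_{1k}(\alpha)\mathcal{G}_1(\alpha)P_{1k}(\alpha)
\]

with $\mathcal{G}_1(\alpha)$ the Gaussian random variable in (50). Furthermore, $P_{1k}(\alpha)\mathcal{G}_1(\alpha)P_{1k}(\alpha)$
has dimension $d_k$ and, in the special case that $d_k = 1$,

\[ \sqrt{n}(\rho_{kn}(\alpha) - \rho_k(\alpha)) \xrightarrow{d} N(0, \sigma_{kk}(\alpha)) \]

where $N(0, \sigma_{kk}(\alpha))$ denotes a normal distribution with zero mean and variance

\[ \sigma_{kk}(\alpha) = E\left\{ \langle f_k(\alpha), \mathcal{G}_1(\alpha)f_k(\alpha) \rangle^2 \right\}. \]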

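Proof: Let $\rho_k(\alpha)$ denote the $k$-th distinct eigenvalue of $\mathcal{S}_1(\alpha)$ and assume that
it has multiplicity $d_k$. As $\|P_{1kn}(\alpha) - P_{1k}(\alpha)\| \xrightarrow{p} 0$, Theorem 10.1 ensures
that for $n$ large enough, $\mathrm{rank}(P_{1kn}(\alpha)) = \mathrm{rank}(P_{1k}(\alpha)) = d_k$ with probability
tending to 1. Now observe that

\[ \sqrt{n}[\rho_{kn}(\alpha) - \rho_k(\alpha)] = \sqrt{n}\{P_{1kn}(\alpha)\mathcal{S}_{1n}(\alpha)P_{1kn}(\alpha) - \rho_k(\alpha)P_{1k}(\alpha)\} = \sqrt{n} \sum_{i=1}^{3} B_{ki}(\alpha) \]

where

\[
\begin{aligned}
B_{k1}(\alpha) &= [P_{1kn}(\alpha) - P_{1k}(\alpha)]\,\mathcal{S}_{1n}(\alpha)P_{1kn}(\alpha), \\
B_{k2}(\alpha) &= P_{1k}(\alpha)[\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha)]P_{1kn}(\alpha), \\
B_{k3}(\alpha) &= P_{1k}(\alpha)\mathcal{S}_1(\alpha)[P_{1kn}(\alpha) - P_{1k}(\alpha)].
\end{aligned}
\]

Equations (62), (53) and Slutsky’s theorem then ensure that

\[
\begin{aligned}
\|\sqrt{n} B_{k1}(\alpha)\|_{HS}^2 &\xrightarrow{d} \left\| \{Q_{1k}(\alpha)\mathcal{G}_1(\alpha)P_{1k}(\alpha) + P_{1k}(\alpha)\mathcal{G}_1(\alpha)Q_{1k}(\alpha)\}\,\mathcal{S}_1(\alpha)P_{1k}(\alpha) \right\|_{HS}^2 \\
&\le \|Q_{1k}(\alpha)\mathcal{G}_1(\alpha)\mathcal{S}_1(\alpha)P_{1k}(\alpha)\|_{HS}^2 + \|P_{1k}(\alpha)\mathcal{G}_1(\alpha)Q_{1k}(\alpha)\mathcal{S}_1(\alpha)P_{1k}(\alpha)\|_{HS}^2 \\
&= \|Q_{1k}(\alpha)\mathcal{G}_1(\alpha)\mathcal{S}_1(\alpha)P_{1k}(\alpha)\|_{HS}^2 + \|P_{1k}(\alpha)\mathcal{G}_1(\alpha)\,[Q_{1k}(\alpha)P_{1k}(\alpha)]\,\mathcal{S}_1(\alpha)\|_{HS}^2 \\
&= \mathrm{tr}\left( P_{1k}(\alpha)\mathcal{S}_1(\alpha)\mathcal{G}_1(\alpha)Q_{1k}^2(\alpha)\mathcal{G}_1(\alpha)\mathcal{S}_1(\alpha)P_{1k}(\alpha) \right) + 0 \\
&= \mathrm{tr}\left( \mathcal{S}_1(\alpha)\,[P_{1k}(\alpha)Q_{1k}(\alpha)]\,\mathcal{G}_1(\alpha)\mathcal{G}_1(\alpha)\,[Q_{1k}(\alpha)P_{1k}(\alpha)]\,\mathcal{S}_1(\alpha) \right) = 0.
\end{aligned}
\]

Similarly,

\[
\begin{aligned}
\|\sqrt{n} B_{k3}(\alpha)\|_{HS}^2 &\xrightarrow{d} \left\| P_{1k}(\alpha)\mathcal{S}_1(\alpha)\,\{Q_{1k}(\alpha)\mathcal{G}_1(\alpha)P_{1k}(\alpha) + P_{1k}(\alpha)\mathcal{G}_1(\alpha)Q_{1k}(\alpha)\} \right\|_{HS}^2 \\
&\le \|\mathcal{S}_1(\alpha)\,[P_{1k}(\alpha)Q_{1k}(\alpha)]\,\mathcal{G}_1(\alpha)P_{1k}(\alpha)\|_{HS}^2 + \|P_{1k}(\alpha)\mathcal{S}_1(\alpha)\mathcal{G}_1(\alpha)Q_{1k}(\alpha)\|_{HS}^2 \\
&= 0 + \mathrm{tr}\left( Q_{1k}(\alpha)\mathcal{G}_1(\alpha)\mathcal{S}_1(\alpha)P_{1k}(\alpha)\mathcal{S}_1(\alpha)\mathcal{G}_1(\alpha)Q_{1k}(\alpha) \right) \\
&= \mathrm{tr}\left( \mathcal{G}_1(\alpha)\,[Q_{1k}(\alpha)P_{1k}(\alpha)]\,\mathcal{S}_1(\alpha)P_{1k}(\alpha)\mathcal{S}_1(\alpha)\,[P_{1k}(\alpha)Q_{1k}(\alpha)] \right) = 0.
\end{aligned}
\]

Hence Corollary 6.1 and Slutsky’s Theorem ensure that

\[ \sqrt{n} B_{k2}(\alpha) \xrightarrow{d} P_{1k}(\alpha)\mathcal{G}_1(\alpha)P_{1k}(\alpha) \]

which proves the first part of the theorem.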

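To see the validity of the second part of the theorem, assume that $d_k = 1$
and observe that

\[ \sqrt{n}(\rho_{kn}(\alpha) - \rho_k(\alpha)) = \sqrt{n} \sum_{i=1}^{3} C_{ki}(\alpha) \]

where

\[
\begin{aligned}
C_{k1}(\alpha) &= \langle [f_{kn}(\alpha) - f_k(\alpha)], \mathcal{S}_{1n}(\alpha)f_{kn}(\alpha) \rangle_{\mathcal{H}(K_1)}, \\
C_{k2}(\alpha) &= \langle f_k(\alpha), [\mathcal{S}_{1n}(\alpha) - \mathcal{S}_1(\alpha)]f_{kn}(\alpha) \rangle_{\mathcal{H}(K_1)}, \\
C_{k3}(\alpha) &= \langle f_k(\alpha), \mathcal{S}_1(\alpha)[f_{kn}(\alpha) - f_k(\alpha)] \rangle_{\mathcal{H}(K_1)}.
\end{aligned}
\]

Note that $\sqrt{n}\,C_{k1}(\alpha) \xrightarrow{p} 0$ and $\sqrt{n}\,C_{k3}(\alpha) \xrightarrow{p} 0$ as a consequence of equations (54),
(62), and Slutsky’s theorem since

\[
\sqrt{n}\,C_{k1}(\alpha) \xrightarrow{d} \langle Q_{1k}(\alpha)\mathcal{G}_1(\alpha)f_k(\alpha), \mathcal{S}_1(\alpha)f_k(\alpha) \rangle_{\mathcal{H}(K_1)}
= \rho_k(\alpha)\,\langle Q_{1k}(\alpha)\mathcal{G}_1(\alpha)f_k(\alpha), f_k(\alpha) \rangle_{\mathcal{H}(K_1)} = 0
\]

and,

\[
\begin{aligned}
\sqrt{n}\,C_{k3}(\alpha) &\xrightarrow{d} \langle f_k(\alpha), \mathcal{S}_1(\alpha)Q_{1k}(\alpha)\mathcal{G}_1(\alpha)f_k(\alpha) \rangle_{\mathcal{H}(K_1)} \\
&= \sum_{j \ne k} \frac{\rho_j(\alpha)}{\rho_j(\alpha) - \rho_k(\alpha)} \langle P_{1j}(\alpha)f_k(\alpha), \mathcal{G}_1(\alpha)f_k(\alpha) \rangle_{\mathcal{H}(K_1)} = 0.
\end{aligned}
\]

Application of Theorem 6.1 and Slutsky’s Theorem then ensures that
$\sqrt{n}\,C_{k2}(\alpha) \xrightarrow{d} \langle f_k(\alpha), \mathcal{G}_1(\alpha)f_k(\alpha) \rangle_{\mathcal{H}(K_1)}$, which completes the proof since

\[ E\left\{ \langle f_k(\alpha), \mathcal{G}_1(\alpha)f_k(\alpha) \rangle_{\mathcal{H}(K_1)} \right\} = 0 \]

and hence

\[ \mathrm{Var}\left\{ \langle f_k(\alpha), \mathcal{G}_1(\alpha)f_k(\alpha) \rangle_{\mathcal{H}(K_1)} \right\} = E\left\{ \langle f_k(\alpha), \mathcal{G}_1(\alpha)f_k(\alpha) \rangle^2_{\mathcal{H}(K_1)} \right\} = \sigma_{kk}(\alpha). \]

❖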

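Now as the Gaussian operator $\mathcal{G}_1(\alpha)$ is Hilbert-Schmidt, we note that the
variances $\sigma_{kk}(\alpha) \downarrow 0$ as $k \uparrow \infty$.

Of natural interest is the degree of correlation between the regularized
correlation estimators $\{\rho_{kn}(\alpha), \rho_{jn}(\alpha)\}$ with $j \ne k$. To investigate this association
let us take the simple case where for $j \ne k$, the multiplicities $d_k = d_j = 1$.
Then,

\[
\begin{aligned}
\sigma_{jk}(\alpha) &= \mathrm{Cov}[\rho_{kn}(\alpha), \rho_{jn}(\alpha)] \\
&= E\left\{ \langle f_{kn}(\alpha), \mathcal{S}_{1n}(\alpha)f_{kn}(\alpha) \rangle_{\mathcal{H}(K_1)} \langle f_{jn}(\alpha), \mathcal{S}_{1n}(\alpha)f_{jn}(\alpha) \rangle_{\mathcal{H}(K_1)} \right\} \\
&= E\left\{ \langle (f_{kn}(\alpha) \otimes_{\mathcal{H}(K_1)} f_{kn}(\alpha)), \mathcal{S}_{1n}(\alpha) \rangle_{HS} \langle \mathcal{S}_{1n}(\alpha), (f_{jn}(\alpha) \otimes_{\mathcal{H}(K_1)} f_{jn}(\alpha)) \rangle_{HS} \right\} \\
&= \langle (f_k(\alpha) \otimes_{\mathcal{H}(K_1)} f_k(\alpha)), E\{\mathcal{S}_{1n}(\alpha) \otimes_{HS} \mathcal{S}_{1n}(\alpha)\}(f_j(\alpha) \otimes f_j(\alpha)) \rangle_{HS} \\
&= \langle (f_k(\alpha) \otimes_{\mathcal{H}(K_1)} f_k(\alpha)), \Sigma_1(\alpha)(f_j(\alpha) \otimes f_j(\alpha)) \rangle_{HS} = [\Sigma_1(\alpha)]_{jk}
\end{aligned}
\]

where $\Sigma_1(\alpha) = E\{\mathcal{S}_{1n}(\alpha) \otimes_{HS} \mathcal{S}_{1n}(\alpha)\}$, and $[\Sigma_1(\alpha)]_{jk}$ is the $(j,k)$-th element of
$\Sigma_1(\alpha)$. These developments suggest that the $j$-th and $k$-th regularized canonical
correlation estimators are not necessarily independent.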

There are many similarities between the Tikhinov regularized version of
canonical correlation analysis discussed here and those discussed in Cupidon et
al. [6]. However, it is important to note the distinctions between Cupidon et al. [6] and
the method discussed here. Firstly, in Cupidon et al. [6] the regularized operators
discussed were of the form $(S_1 + \alpha I)^{-1/2} S_{12} (S_2 + \alpha I)^{-1} S_{21} (S_1 + \alpha I)^{-1/2}$, whereas
in our approach they are $S_{12} (S_2 + \alpha I)^{-1} S_{21} (S_1 + \alpha I)^{-1}$. Secondly, in Cupidon et
al. [6] the operator approaches $RR^*$ as the regularization parameter approaches
zero, and in our approach it tends to $TT^*$, an RKHS based operator which has
well posed solutions on a closed domain. Finally, the variance in the asymptotic
distribution of $S_{12} (S_2 + \alpha I)^{-1} S_{21} (S_1 + \alpha I)^{-1}$ is the sum of 4 terms, whereas the
variance of $(S_1 + \alpha I)^{-1/2} S_{12} (S_2 + \alpha I)^{-1} S_{21} (S_1 + \alpha I)^{-1/2}$ involves 5 terms.


7. TSVD Regularization Approach

In the Tikhinov approach to regularization, the operators $\{S_1, S_2\}$ are
replaced with the operators $\{(S_1 + \alpha I), (S_2 + \alpha I)\}$ to obtain invertible operators.
By contrast, the truncated singular value decomposition (TSVD) method of
regularization replaces the compact operators $\{S_1, S_2\}$ with the finite rank (and
rank-deficient) operators

\[ S_1(\alpha) = \sum_{\lambda_{1i} > \alpha} \lambda_{1i}\, \phi_{1i} \otimes_{L^2(E_1)} \phi_{1i} \quad \text{and} \quad S_2(\alpha) = \sum_{\lambda_{2i} > \alpha} \lambda_{2i}\, \phi_{2i} \otimes_{L^2(E_2)} \phi_{2i}. \]

Let us now define $m_1(\alpha) = \{\#\text{ of } \lambda_{1i} > \alpha\}$ with a similar definition holding
for $m_2(\alpha)$. To ensure the equal dimensionality of the truncated versions
of $S_1$ and $S_2$, it is advantageous to re-parameterize TSVD regularization in
terms of $\{m_1(\alpha), m_2(\alpha)\}$, rather than $\alpha$. In this regard, for simplicity we will
always take $m = m_1(\alpha) = m_2(\alpha)$. Notice that, under this re-parametrization,
the compact operators $S_1$ and $S_2$ are replaced by the finite dimensional operators
$S_1(m) = S_1\Pi_1(m)$ and $S_2(m) = S_2\Pi_2(m)$, where $\Pi_1(m) = \sum_{j \le m} P_{1j}$ and
$\Pi_2(m) = \sum_{j \le m} P_{2j}$ are the projection operators associated with the largest $m$
eigenvalues of $S_1$ and $S_2$ (or cumulative projection operators). Much like $\alpha$ in
Tikhinov regularization, the truncation parameter $m$ is the regularization
parameter. In TSVD we are interested in the case where $m = m(\alpha) \to \infty$, which
occurs when $\alpha \downarrow 0$. However, it is important to mention that TSVD
regularization is widely used in statistical practice, as it is common for a practicing
statistician to discard right and left eigenvectors corresponding to small singular
values after looking at, for example, a scree plot of the singular values.
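
For a covariance matrix, the truncation described above amounts to keeping the leading eigenpairs. The following sketch is a finite-dimensional illustration only; the function name `tsvd_truncate` and the test matrix are hypothetical, and the matrix is assumed symmetric positive definite so that the retained eigenvalues are non-zero.

```python
import numpy as np

def tsvd_truncate(S, m):
    """Keep the m largest eigenpairs of a symmetric positive definite matrix S.

    Returns (S_m, Pi_m, S_m_pinv): the truncated operator S * Pi(m), the cumulative
    projection Pi(m), and the pseudo-inverse of the truncated operator.
    """
    eigenvalues, eigenvectors = np.linalg.eigh(S)        # ascending order
    order = np.argsort(eigenvalues)[::-1][:m]            # indices of the m largest
    lam, phi = eigenvalues[order], eigenvectors[:, order]
    Pi_m = phi @ phi.T
    S_m = (phi * lam) @ phi.T                             # sum of lam_i * phi_i phi_i^T
    S_m_pinv = (phi / lam) @ phi.T                        # sum of phi_i phi_i^T / lam_i
    return S_m, Pi_m, S_m_pinv

rng = np.random.default_rng(4)
B = rng.standard_normal((6, 6))
S = B @ B.T + 0.1 * np.eye(6)                             # hypothetical covariance matrix
S_m, Pi_m, S_m_pinv = tsvd_truncate(S, m=3)
print(np.linalg.matrix_rank(S_m))
```

The choice of `m` plays the role of the scree-plot cut-off: everything beyond the `m` leading components is projected away before inversion.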

Development of the theory associated with the TSVD version of regularized
canonical correlation analysis can now proceed along lines that are parallel to
the developments in Sections 5 and 6. Accordingly, let us define the operators
$R(m) : \mathcal{H}_2 \to \mathcal{H}_1$ and $T(m) : \mathcal{H}(K_2) \to \mathcal{H}(K_1)$ by

\[ R(m) = S_1(m)^{1/2\dagger} S_{12} S_2(m)^{1/2\dagger} \tag{64} \]

and

\[ T(m) = \Gamma_1 S_1(m)^{1/2\dagger} S_{12} S_2(m)^{1/2\dagger} \Gamma_2^{-1}. \tag{65} \]

As all operators in (64) and (65) are bounded and $S_{12}$ is Hilbert-Schmidt, both
$R(m)$ and $T(m)$ are Hilbert-Schmidt. Also, as the regularization parameter
$m \to \infty$, both $\Pi_1(m)$ and $\Pi_2(m)$ converge to the identity. Thus, by Theorem
4.1, $R(m)$ converges in operator norm to the operator $\Gamma_1^{-1} T \Gamma_2$ and the continuity
of $\Gamma_1$ and $\Gamma_2$ then entail that

\[ T(m) = \Gamma_1 R(m) \Gamma_2^{-1} \to T \quad \text{as } m \to \infty \]

with convergence in operator norm.

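Now suppose that $\{\rho_j(m), f_j(m), g_j(m)\}_{j=1}^{\mathrm{rank}(R(m)^*R(m))}$ is the singular
system for $R(m)$ with

\[ R(m) = \sum_{i,j=1}^{m} \frac{\cdots}{\lambda_{1i}\lambda_{2j}}\, \phi_{2j} \otimes \phi_{1i} = \sum_{j=1}^{\mathrm{rank}(R(m)^*R(m))} \rho_j(m)\, [g_j(m) \otimes_{\mathcal{H}_2} f_j(m)]. \]

Then,

\[
\begin{aligned}
T(m) = \Gamma_1 R(m) \Gamma_2^{-1} &= \sum_{j=1}^{\mathrm{rank}(R(m)^*R(m))} \rho_j(m)\, [(\Gamma_2 g_j(m)) \otimes_{\mathcal{H}(K_2)} (\Gamma_1 f_j(m))] \\
&= \sum_{j=1}^{\mathrm{rank}(T(m)^*T(m))} \rho_j(m)\, g_j(m) \otimes_{\mathcal{H}(K_2)} f_j(m)
\end{aligned}
\]

with $\{f_j(m) = \Gamma_1 f_j(m), g_j(m) = \Gamma_2 g_j(m)\}$. In
general $\mathrm{rank}(T(m)^*T(m)) \le m$. Now by (16) and because $\{\phi_{ij}\}_{j=1}^{m}$ are CONSs
for $[\Pi_i(m)\mathcal{H}_i]$, for $i = 1,2$, it follows that the canonical weight functions in
$\mathcal{H}(K_1)$ and $\mathcal{H}(K_2)$ may be written as

\[ f_j(m) = \sum_{i=1}^{m} \lambda_{1i} f_{ji}(m) \phi_{1i} \quad \text{and} \quad g_j(m) = \sum_{i=1}^{m} \lambda_{2i} g_{ji}(m) \phi_{2i} \]

with

\[ f_{ji}(m) = \langle f_j(m), \phi_{1i} \rangle_{\mathcal{H}_1} \quad \text{and} \quad g_{ji}(m) = \langle g_j(m), \phi_{2i} \rangle_{\mathcal{H}_2}. \]

The corresponding regularized canonical variables are

\[ U_j(m) = \Psi_1(f_j(m)) = \sum_{i=1}^{m} f_{ji}(m) \langle X_1, \phi_{1i} \rangle_{\mathcal{H}_1} \quad \text{and} \quad V_j(m) = \Psi_2(g_j(m)) = \sum_{i=1}^{m} g_{ji}(m) \langle X_2, \phi_{2i} \rangle_{\mathcal{H}_2}. \]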

The TSVD parallels to Theorems 5.1 and 5.2 from Cupidon et al. [6] also
hold.


Theorem 7.1. For any $f \in \mathcal{H}_1$ and $g \in \mathcal{H}_2$, with $f = \Gamma_1(f)$, $g = \Gamma_2(g)$,

\[ 0 \le \rho_j^2(m) \uparrow \rho_j^2 \le 1, \quad \text{as } m \to \infty. \tag{66} \]

Proof: The convergence is from below since

\[ \|T(m)\| = \|\Pi_1(m) T \Pi_2(m)\| \le \|\Pi_1(m)\| \, \|T\| \, \|\Pi_2(m)\| \le \|T\|. \]

To see that (66) holds, fix $j \ge 1$ and observe that as $m \uparrow \infty$

\[ (\rho_j^2 - \rho_j^2(m)) = |\rho_j^2 - \rho_j^2(m)| \le \|TT^* - T(m)T^*(m)\| \downarrow 0. \]

❖


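Theorem 7.2. Let $\{f_j(m), g_j(m)\} = \{\Gamma_1(f_j(m)), \Gamma_2(g_j(m))\} \in \{\mathcal{H}(K_1), \mathcal{H}(K_2)\}$
denote the regularized weight functions corresponding to the TSVD version of
canonical correlation analysis. Then, as $m \to \infty$ for $j = 1, 2, \ldots$,

\[ \|f_j(m) - f_j\|_{\mathcal{H}(K_1)} \to 0 \quad \text{and} \quad \|g_j(m) - g_j\|_{\mathcal{H}(K_2)} \to 0. \]

Proof: The proof here parallels the one for Theorem 5.2. The idea is that
since $\|T(m)T^*(m) - TT^*\| \to 0$ as $m \to \infty$, for any $j \in \mathbb{Z}$
the corresponding eigenprojection operators satisfy $\|P_j(m) - P_j\|_{HS} \to 0$. If we now
assume, WLOG, that $\langle f_j(m), f_j \rangle_{\mathcal{H}(K_1)} \ge 0$, the relation

\[
\begin{aligned}
\|P_j(m) - P_j\|_{HS}^2 &= \langle P_j(m) - P_j, P_j(m) - P_j \rangle_{HS} \\
&= 2 - 2\langle P_j(m), P_j \rangle_{HS} \\
&= 2 - 2\langle (f_j(m) \otimes_{\mathcal{H}(K_1)} f_j(m)), (f_j \otimes_{\mathcal{H}(K_1)} f_j) \rangle_{HS} \\
&= 2 - 2\langle f_j(m), f_j \rangle_{\mathcal{H}(K_1)}^2 \\
&= 2(1 - \langle f_j(m), f_j \rangle_{\mathcal{H}(K_1)})(1 + \langle f_j(m), f_j \rangle_{\mathcal{H}(K_1)}) \\
&= \|f_j(m) - f_j\|_{\mathcal{H}(K_1)}^2 (1 + \langle f_j(m), f_j \rangle_{\mathcal{H}(K_1)}) \\
&\ge \|f_j(m) - f_j\|_{\mathcal{H}(K_1)}^2
\end{aligned}
\]

implies that

\[ \|f_j(m) - f_j\|_{\mathcal{H}(K_1)}^2 \le \|P_j(m) - P_j\|_{HS}^2 \to 0 \]

as the regularization parameter $m \uparrow \infty$. ❖

Theorem 7.2, along with the continuity of the mappings $\Psi_1$ and $\Psi_2$, ensures
the convergence of the regularized canonical variables $U_j(m) = \Psi_1(f_j(m))$ and
$V_j(m) = \Psi_2(g_j(m))$ to the true canonical variables $\Psi_1(f_j)$ and $\Psi_2(g_j)$, as $m \to \infty$.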

Let us now discuss the computation of the singular value decomposition
of $T(m)$. To accomplish this it suffices to consider the eigenvalue-eigenvector
decomposition of $T(m)T^*(m)$. This is the finite rank operator given by

\[
\begin{aligned}
T_1(m) &= T(m)T^*(m) = \Gamma_1 R(m) R^*(m) \Gamma_1^{-1} \\
&= \Gamma_1 \Pi_1(m) S_1(m)^{1/2\dagger} S_{12} S_2(m)^{\dagger} S_{21} S_1(m)^{1/2\dagger} \Pi_1(m) \Gamma_1^{-1}.
\end{aligned}
\tag{67}
\]

As was true for the Tikhinov case, problems arise from the presence of the
unknown $\Gamma_1$ in $T(m)$. However, unlike Tikhinov regularization the operators
involved are finite rank. Note that in the finite rank case $\mathrm{Im}(S_1) = \mathrm{Im}(S_1^{1/2}) =
\ker(S_1)^{\perp}$ and we may substitute $\Gamma_1\Pi_1(m)$ with $S_1(m)^{1/2}$ directly. Upon direct
substitution of $S_1^{1/2}$ for $\Gamma_1$ in (67) we obtain the operator

\[ \mathcal{S}_1(m) = \Pi_1(m) S_{12} S_2(m)^{\dagger} S_{21} S_1(m)^{\dagger} \tag{68} \]

which is a mapping from $\mathcal{H}_1$ into $\mathcal{H}_1$. Much like its Tikhinov cousin, the operator
$\mathcal{S}_1(m)$ is self-adjoint since


‘5i(to) = ^ 


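Additionally, since the operator is finite rank, it is Hilbert-Schmidt and admits
the eigenvalue-eigenvector decomposition

\[ \mathcal{S}_1(m) = \sum_{j=1}^{\mathrm{rank}(\mathcal{S}_1(m))} \rho_j^2(m)\, [f_j(m) \otimes f_j(m)] \]

with $\{\rho_j^2(m), f_j(m)\}$ the eigensystem for $\mathcal{S}_1(m)$.

All of the themes discussed in Section 2.1 are still applicable here. For
example, since the eigenfunctions $\{f_j(m)\}$ are in $\mathrm{Im}(S_1^{1/2})$, they belong to both
$\mathcal{H}(K_1)$ and $\mathcal{H}_1$. If the eigenfunctions are considered to be elements of $\mathcal{H}(K_1)$
we will notate these as $\{\tilde{f}_j(m)\}$ with $\tilde{f}_j(m) = f_j(m)$. We will now show that
as $m \uparrow \infty$

\[ \Gamma_1^{-1} \mathcal{S}_1(m) \Gamma_1 \to \Gamma_1^{-1} TT^* \Gamma_1. \]

To see this notice that

\[
\Gamma_1^{-1} \mathcal{S}_1(m) \Gamma_1 = \sum_{i,j=1}^{m} \frac{\cdots}{\lambda_{1i}\lambda_{2j}} [\cdots]
\to \sum_{i,j=1}^{\infty} \frac{\cdots}{(\lambda_{1i})(\lambda_{2j})} [\phi_{1i} \otimes \phi_{1j}] = \Gamma_1^{-1} TT^* \Gamma_1.
\]

Therefore, since $\mathcal{S}_1(m)$ is Hilbert-Schmidt, Theorem 4.1 ensures that as the
regularization parameter $m \to \infty$,

\[ \|\Gamma_1^{-1} \mathcal{S}_1(m) \Gamma_1 - \Gamma_1^{-1} TT^* \Gamma_1\|_{\mathcal{H}_1} = \|\mathcal{S}_1(m) - TT^*\|_{\mathcal{H}(K_1)} \to 0. \]

If $\mathcal{S}_1(m)$ is regarded as an operator on $\mathcal{H}(K_1)$, we let $\{\rho_j^2(m), f_j(m)\}$ denote
the eigenvalue and eigenvector pairs for the operator and using this notation

\[ \mathcal{S}_1(m) = \sum_{j=1}^{\mathrm{rank}(\mathcal{S}_1(m))} \rho_j^2(m)\, [f_j(m) \otimes f_j(m)]. \]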

8. Asymptotics for the TSVD Operators

In this section we will discuss the large sample distribution and consistency
of sample versions of $\mathcal{S}_1(m)$. The obvious estimator for this quantity is given by

\[ \mathcal{S}_{1n}(m) = \Pi_{1n}(m) S_{12n} S_{2n}(m)^{\dagger} S_{21n} S_{1n}(m)^{\dagger} \tag{69} \]

with $\Pi_{in}(m) = \sum_{j \le m} P_{ijn}$ and $S_{in}(m)^{\dagger} = (S_{in}\Pi_{in}(m))^{\dagger} = (S_{in})^{\dagger}\Pi_{in}(m)$ for
$i = 1,2$. By considering each factor associated with the TSVD operators in
equation (69) it is clear that the asymptotic distribution will differ from its
corresponding Tikhinov counterpart. We begin our analysis by assuming that
the joint process $X$ has zero mean and $E[\|X\|^4_{L^2(E)}] < \infty$. Accordingly, we need
to develop the asymptotic distribution of $\Pi_{in}(m)$ and $S_{in}(m)^{\dagger}$ for $i = 1,2$.

Corollary 8.1. Provided that $E[\|X\|^4_{L^2(E)}] < \infty$, then for $i = 1,2$ and $m \ge 1$,

y/n {flin{m) - n,(TO)) Ai{m) = E E + E E 

j>m k>m 

k>m j>m 

= [I- n,(TO)] + g.jM-Pz,)j [/- n,(TO)] (70) 

where $\mathcal{N}_i$ is the distributional limit of $\sqrt{n}(S_{in} - S_i)$.

Proof: From Dauxois et al. [9] we know that for $i = 1, 2$

\[ \sqrt{n}\{P_{ikn} - P_{ik}\} \xrightarrow{d} P_{ik}\mathcal{N}_iQ_{ik} + Q_{ik}\mathcal{N}_iP_{ik} \tag{71} \]

where $\{P_{ikn}, P_{ik}\}_{k=1}^{\infty}$ denote the eigenprojection operators associated with
the $k$-th largest eigenvalues of $S_{in}$ and $S_i$, and

Qik — 


j^k 


P 


ij ■ 


For the asymptotic distribution of the cumulative eigenprojection operator Ilin{m) 
Pijn, we notice that for all m > 1 the cumulative sum of the first term on 
the right hand side of (71) is 


Pij-^iQij— y]] Pijj^i 


j<m 


j<k 




^ik P 


-P.l 


and involves terms like 


0 

Pi2PiPil 

Ail —Ai2 


Ail-Aim 


PilMP.2 

Ai2 — Ail 

0 


PimA/'jPil 

Ai2 — Aim 


PilMPi3 

Ais —Ai2 


PimMiPl3 


Pil^J'iPirr^ 

p^kl. 

Aim-Ai2 


Hence for any term with j,k < m and j ^ k the upper triangular terms 
(UTT) involve and the lower triangular terms (LTT) are = 

-{P,^M.P.,r gQ ^iTT = -UTT*. Since 


\/n(n„(m) - nj(TO)) ^ QijMiPij (72) 


and (PijAfiQij)* = QijMiPij, the lower triangular terms in the first summand 
will cancel with the upper triangular terms in the second summand for all in¬ 
dices i,j < m. Equation (72) then telescopes and produces the following new 
asymptotic result 

^/n (Hi (to) - Hi (to)) y^ PijNiPik + X! y] 

j>m ^¥^3 j>m ^¥3 

k>m. k>m 

= [I - ni(TO)] I '^{PijNiQij T QijNiPij) j [^ - ni(TO)]. 


It is important to note that Corollary 8.1 has applications not just to canonical 
correlation analysis but also to principal component analysis. 
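
The cancellation of within-block terms for a cumulative projection can be seen directly in a matrix example. The sketch below uses the standard first-order perturbation expansion for a symmetric matrix with distinct eigenvalues; its sign convention, denominators $\lambda_j - \lambda_k$ with $j$ retained and $k$ discarded, is the textbook one and is an assumption here that may differ from the convention used for $Q_{ik}$ in the text. All names and matrices are hypothetical.

```python
import numpy as np

def cumulative_projection(S, m):
    """Projection onto the span of the eigenvectors of the m largest eigenvalues of S."""
    w, V = np.linalg.eigh(S)
    top = V[:, np.argsort(w)[::-1][:m]]
    return top @ top.T

def first_order_term(S, B, m):
    """Standard first-order change of the cumulative projection under S -> S + eps * B."""
    w, V = np.linalg.eigh(S)
    order = np.argsort(w)[::-1]
    w, V = w[order], V[:, order]
    d = S.shape[0]
    D = np.zeros((d, d))
    for j in range(m):                     # retained indices
        for k in range(m, d):              # discarded indices
            Pj = np.outer(V[:, j], V[:, j])
            Pk = np.outer(V[:, k], V[:, k])
            D += (Pj @ B @ Pk + Pk @ B @ Pj) / (w[j] - w[k])
    return D

rng = np.random.default_rng(5)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
S = Q @ np.diag([5.0, 4.0, 3.0, 2.0, 1.0]) @ Q.T   # well-separated spectrum
B = rng.standard_normal((5, 5))
B = (B + B.T) / 2.0
B /= np.linalg.norm(B, 2)                           # symmetric perturbation of unit norm
m, eps = 2, 1e-6

Pi = cumulative_projection(S, m)
D = first_order_term(S, B, m)
residual = np.linalg.norm(cumulative_projection(S + eps * B, m) - Pi - eps * D)
within_block = np.linalg.norm(Pi @ D @ Pi)
print(residual, within_block)
```

Only pairs consisting of one retained and one discarded index contribute to `D`; terms pairing two retained indices cancel, so the part of `D` inside the retained block vanishes.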

Now, consider the asymptotic distribution of $S_{in}(m)^{\dagger} = (S_{in}\Pi_{in}(m))^{\dagger}$ for some
$m > 0$ and $i = 1,2$. In this regard, observe that the function $F(z) = z^{-1}$ is
analytic everywhere in the complex plane except for a pole at zero. Therefore $F$
is analytic on the subset of the complex plane defined by $D_i = \{z \in \mathbb{C} : \mathrm{Re}(z) >
\lambda_{im} - \epsilon\}$ with $0 < \epsilon < \lambda_{im}$. The set $D_i$ also contains the spectrum of $S_i(m)$.
Consequently, by the delta theorem (see Cupidon et al. [7] and appendix) we
have that

\[ \sqrt{n}\{S_{in}(m)^{\dagger} - S_i(m)^{\dagger}\} \xrightarrow{d} B_i(m) \tag{73} \]

for $i = 1,2$ where

\[
B_i(m) = -\sum_{j=1}^{m} \lambda_{ij}^{-2} P_{ij}\mathcal{N}_iP_{ij}
+ \sum_{\substack{j \ne k \\ j,k \le m}} \frac{\lambda_{ik}^{-1} - \lambda_{ij}^{-1}}{\lambda_{ik} - \lambda_{ij}} P_{ik}\mathcal{N}_iP_{ij}.
\tag{74}
\]

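The asymptotic analysis of $\sqrt{n}(\mathcal{S}_{in}(m) - \mathcal{S}_i(m))$ for $i = 1, 2$ may now proceed,
where the application of the delta method leads to a product rule development.
For this purpose we introduce the following Gaussian Hilbert-Schmidt operators

\[
\begin{aligned}
\mathcal{F}_{11}(m) &= \mathcal{A}_1(m)\, S_{12}\, S_2(m)^{\dagger}\, S_{21}\, S_1(m)^{\dagger}, \\
\mathcal{F}_{12}(m) &= \Pi_1(m)\, \mathcal{N}_{12}\, S_2(m)^{\dagger}\, S_{21}\, S_1(m)^{\dagger}, \\
\mathcal{F}_{13}(m) &= \Pi_1(m)\, S_{12}\, B_2(m)\, S_{21}\, S_1(m)^{\dagger}, \\
\mathcal{F}_{14}(m) &= \Pi_1(m)\, S_{12}\, S_2(m)^{\dagger}\, \mathcal{N}_{21}\, S_1(m)^{\dagger}, \\
\mathcal{F}_{15}(m) &= \Pi_1(m)\, S_{12}\, S_2(m)^{\dagger}\, S_{21}\, B_1(m), \\
\mathcal{F}_1(m) &= \sum_{j=1}^{5} \mathcal{F}_{1j}(m) = \sum_{j=2}^{5} \mathcal{F}_{1j}(m).
\end{aligned}
\tag{75}
\]

The corollary below then results from the application of the delta theorem (see
Cupidon et al. [7]).


Corollary 8.2. If $E[\|X\|^4_{L^2(E)}] < \infty$, then as $n \to \infty$,

\[ \sqrt{n}(\mathcal{S}_{1n}(m) - \mathcal{S}_1(m)) \xrightarrow{d} \mathcal{F}_1(m). \tag{76} \]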

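Proof: The proof follows along the lines of the one for Corollary 6.1. Specifically,
we begin by defining the elements

\[
\begin{aligned}
A_{11}(m) &= [\Pi_{1n}(m) - \Pi_1(m)]\, S_{12n}\, S_{2n}(m)^{\dagger}\, S_{21n}\, S_{1n}(m)^{\dagger}, \\
A_{12}(m) &= \Pi_1(m)\, [S_{12n} - S_{12}]\, S_{2n}(m)^{\dagger}\, S_{21n}\, S_{1n}(m)^{\dagger}, \\
A_{13}(m) &= \Pi_1(m)\, S_{12}\, [S_{2n}(m)^{\dagger} - S_2(m)^{\dagger}]\, S_{21n}\, S_{1n}(m)^{\dagger}, \\
A_{14}(m) &= \Pi_1(m)\, S_{12}\, S_2(m)^{\dagger}\, [S_{21n} - S_{21}]\, S_{1n}(m)^{\dagger}, \\
A_{15}(m) &= \Pi_1(m)\, S_{12}\, S_2(m)^{\dagger}\, S_{21}\, [S_{1n}(m)^{\dagger} - S_1(m)^{\dagger}].
\end{aligned}
\]

With this notation we may write

\[ \sqrt{n}(\mathcal{S}_{1n}(m) - \mathcal{S}_1(m)) = \sqrt{n} \sum_{i=1}^{5} A_{1i}(m). \]

The application of (44), (70), (73) and Slutsky’s Theorem then ensure that

\[ \sqrt{n} \sum_{i=1}^{5} A_{1i}(m) \xrightarrow{d} \mathcal{F}_1(m) \]

since, for example, the term $\sqrt{n} A_{11}(m)$ consists of the factor $\sqrt{n}[\Pi_{1n}(m) - \Pi_1(m)] \xrightarrow{d}
\mathcal{A}_1(m)$ right-multiplied by the factor

\[ S_{12n}\, S_{2n}(m)^{\dagger}\, S_{21n}\, S_{1n}(m)^{\dagger} \xrightarrow{p} S_{12}\, S_2(m)^{\dagger}\, S_{21}\, S_1(m)^{\dagger}. \]

We can now show that $\|\mathcal{F}_{11}(m)\| = 0$ with probability 1. To see this note
that $\mathcal{F}_{11}(m)$ is self-adjoint because it is the distributional limit of self-adjoint
operators. Furthermore, as a consequence of Corollary 8.1

\[
\begin{aligned}
\mathcal{F}_{11}(m) &= \mathcal{A}_1(m)\, S_{12}\, S_2(m)^{\dagger}\, S_{21}\, S_1(m)^{\dagger} \\
&= [I - \Pi_1(m)]\, \mathcal{A}_1(m)\, S_{12}\, S_2(m)^{\dagger}\, S_{21}\, S_1(m)^{\dagger}\, [\Pi_1(m)] \\
&= [I - \Pi_1(m)]\, \mathcal{F}_{11}(m)\, [\Pi_1(m)] \\
&= [I - \Pi_1(m)]\, S_1(m)^{\dagger}\, S_{12}\, S_2(m)^{\dagger}\, S_{21}\, \mathcal{A}_1(m)\, [\Pi_1(m)] \\
&= [I - \Pi_1(m)]\,[\Pi_1(m)]\, S_1(m)^{\dagger}\, S_{12}\, S_2(m)^{\dagger}\, S_{21}\, \mathcal{A}_1(m)\, [I - \Pi_1(m)]\,[\Pi_1(m)] \\
&= 0.
\end{aligned}
\]

Thus, $\|\mathcal{F}_{11}(m)\| = 0$ with probability 1 and $\mathcal{F}_1(m) = \sum_{j=2}^{5} \mathcal{F}_{1j}(m)$, which
completes the proof. ❖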

Corollary 8.2 ensures that for all $m \ge 1$

\[ \|\mathcal{S}_{1n}(m) - \mathcal{S}_1(m)\| = O_p(n^{-1/2}) \to 0. \]

Hence, $\mathcal{S}_{1n}(m)$ is consistent for $\mathcal{S}_1(m)$. The triangle inequality reveals the
association between errors which originate from having a sample estimator and
using regularization to approximate the desired operator $TT^*$,

\[ \|\mathcal{S}_{1n}(m) - TT^*\| \le \|\mathcal{S}_{1n}(m) - \mathcal{S}_1(m)\| + \|\mathcal{S}_1(m) - TT^*\|. \tag{77} \]

The first term on the right-hand side of (77) is a random error that originates
from using a sample estimator of $\mathcal{S}_1(m)$ and tends to zero as $n \to \infty$. Meanwhile,
the second term on the right-hand side of (77) is a deterministic error that arises
from using a regularized approximation of $TT^*$ and will tend to zero as $m \uparrow \infty$.

Since the limiting distribution for $\sqrt{n}(\mathcal{S}_{1n}(m) - \mathcal{S}_1(m))$ has been established,
we may derive large sample asymptotics for $\{\rho_{1jn}(m), P_{1jn}(m), f_{1jn}(m)\}$ where
these quantities represent the $j$-th eigenvalue, eigenprojection and eigenvector
for $\mathcal{S}_{1n}(m)$. Let $\{\rho_{1j}(m), P_{1j}(m), f_{1j}(m)\}$ denote similar quantities for $\mathcal{S}_1(m)$.

We begin our development with the limiting distribution of the eigenprojection
operators and associated eigenvectors.

Theorem 8.1. Suppose $E\|X\|^4_{L^2(E)} < \infty$. Then, for $m \ge 1$ and as $n \to \infty$

\[ \sqrt{n}\{P_{1kn}(m) - P_{1k}(m)\} \xrightarrow{d} P_{1k}(m)\mathcal{F}_1(m)Q_{1k}(m) + Q_{1k}(m)\mathcal{F}_1(m)P_{1k}(m) \tag{78} \]

where $\mathcal{F}_1(m)$ is as in (76) and

\[ Q_{1k}(m) = \sum_{j \ne k} \frac{1}{\rho_{1j}(m) - \rho_{1k}(m)} P_{1j}(m). \]

In the case that $\mathrm{rank}(P_{1k}(m)) = 1$, then

\[ \sqrt{n}\{f_{1kn}(m) - f_{1k}(m)\} \xrightarrow{d} Q_{1k}(m)\mathcal{F}_1(m)f_{1k}(m). \tag{79} \]

Proof: The proof for the limiting distribution of Piknim) is identical to that 
presented for Tikhinov regularization in Theorem 6.1. The only difference is 
that the role of the parameter a in Tikhinov regularization is replaced by that 
of m in TSVD regularization. For the sake of completeness, we provide a sketch 
of the proof. 

For each A: > 1, let Pfc denote a circle that encloses the eigenvalue pik{m) 
but no other eigenvalues of Si{m). From developments in the appendix, notice 
that 


'/n ^Pikn[m) - Pik{m)^ = j> R{z){Sin{m)-Si{m))R{z)dz+Op{n ^/^). 

(80) 

Focussing attention on the first term on the right hand side of (80), it follows 
from the continuous mapping theorem that 

y/n \Pikn{m) - Pik{m)'^ ^ (*1) 


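Since

$$R(z) = \sum_{k}\left[\frac{1}{\rho_{1k}(m) - z}P_{1k}(m) + O\big((\rho_{1k}(m) - z)^{-2}\big)\right], \qquad (82)$$

all but the lead term in (82) will vanish when the contour integral is taken due
to (96). The integrand in (81) can then be simplified as

$$\sum_{k}\sum_{j}\frac{1}{(\rho_{1k}(m) - z)(\rho_{1j}(m) - z)}P_{1k}(m)\mathcal{F}_1(m)P_{1j}(m).$$

Using the Cauchy integral formula produces

$$\frac{1}{2\pi i}\oint_{\Gamma_l}\frac{dz}{(\rho_{1k}(m) - z)(\rho_{1j}(m) - z)} = \frac{1}{2\pi i}\oint_{\Gamma_l}\frac{(\rho_{1l}(m) - z)\,dz}{(\rho_{1k}(m) - z)(\rho_{1j}(m) - z)(\rho_{1l}(m) - z)}$$

and the only case where the integral is non-zero is when exactly one of $\rho_{1k}(m)$
or $\rho_{1j}(m)$ is not equal to $\rho_{1l}(m)$. Hence

$$\sqrt{n}\{P_{1kn}(m) - P_{1k}(m)\} \xrightarrow{d} \sum_{j \ne k}\frac{1}{\rho_{1j}(m) - \rho_{1k}(m)}\left[P_{1k}(m)\mathcal{F}_1(m)P_{1j}(m) + P_{1j}(m)\mathcal{F}_1(m)P_{1k}(m)\right]$$

$$= P_{1k}(m)\mathcal{F}_1(m)Q_{1k}(m) + Q_{1k}(m)\mathcal{F}_1(m)P_{1k}(m)$$

which establishes (78).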

To establish the limiting distribution of the eigenvectors in (79) we write

$$\sqrt{n}\{f_{1kn}(m) - f_{1k}(m)\} = \sqrt{n}P_{1k}(m)\{f_{1kn}(m) - f_{1k}(m)\} + \sqrt{n}(I - P_{1k}(m))\{f_{1kn}(m) - f_{1k}(m)\}. \qquad (83)$$

Now by using the TSVD analogues to equations (60) and (61) we may see that
the limiting distribution for the first term on the right-hand side of (83) is 0.
For the second term on the right-hand side of (83) we have

$$\sqrt{n}(I - P_{1k}(m))\{f_{1kn}(m) - f_{1k}(m)\} \xrightarrow{d} Q_{1k}(m)\mathcal{F}_1(m)f_{1k}(m),$$

which completes the proof. ◊

We will now derive the limiting distribution for $\sqrt{n}[\rho_{1kn}(m) - \rho_{1k}(m)]$,
where $\{\rho_{1kn}(m), \rho_{1k}(m)\}$ denotes the distinct eigenvalue associated with
$\{\mathcal{S}_{1n}(m), \mathcal{S}_1(m)\}$. Much like the Tikhonov case, the quantity $\sqrt{n}[\rho_{1kn}(m) - \rho_{1k}(m)]$
will be regarded as a vector of dimension equal to the multiplicity, $d_k$, of the
eigenvalue $\rho_{1k}(m)$.


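Theorem 8.2. Assume that $E\|X\|^4_{L^2(E)} < \infty$ and the regularized canonical
correlation, $\rho_{1k}(m)$, has geometric multiplicity $d_k$. Then

$$\sqrt{n}[\rho_{1kn}(m) - \rho_{1k}(m)] \xrightarrow{d} P_{1k}(m)\mathcal{F}_1(m)P_{1k}(m) \qquad (84)$$

with $\mathcal{F}_1(m)$ the Gaussian random variable in (75). Furthermore, $P_{1k}(m)\mathcal{F}_1(m)P_{1k}(m)$
has dimension $d_k$. In the special case that $d_k = 1$,

$$\sqrt{n}(\rho_{1kn}(m) - \rho_{1k}(m)) \xrightarrow{d} N(0, \sigma_{kk}(m)) \qquad (85)$$

where $N(0, \sigma_{kk}(m))$ denotes a normal distribution with zero mean and variance

$$\sigma_{kk}(m) = E\langle f_{1k}(m), \mathcal{F}_1(m)f_{1k}(m)\rangle^2_{\mathcal{H}(K_1)}.$$

Proof: Like before, the proof here naturally parallels the Tikhonov result
presented in Theorem 6.2. Since $\|P_{1kn}(m) - P_{1k}(m)\| = O_p(n^{-1/2})$, Theorem 10.1
ensures that for large enough $n$, $\mathrm{rank}(P_{1kn}(m)) = \mathrm{rank}(P_{1k}(m)) = d_k$ with
probability tending to 1 as $n \to \infty$. Now let us define

$$v_{k1}(m) = [P_{1kn}(m) - P_{1k}(m)]\mathcal{S}_{1n}(m)P_{1kn}(m),$$

$$v_{k2}(m) = P_{1k}(m)[\mathcal{S}_{1n}(m) - \mathcal{S}_1(m)]P_{1kn}(m),$$

$$v_{k3}(m) = P_{1k}(m)\mathcal{S}_1(m)[P_{1kn}(m) - P_{1k}(m)],$$

and note that

$$\sqrt{n}[\rho_{1kn}(m) - \rho_{1k}(m)] = \sqrt{n}\sum_{j=1}^{3}v_{kj}(m). \qquad (86)$$

Note that $\sqrt{n}v_{k1}(m) \xrightarrow{p} 0$ and $\sqrt{n}v_{k3}(m) \xrightarrow{p} 0$ since

$$\|\sqrt{n}v_{k1}(m)\|^2_{HS} \xrightarrow{d} \|[Q_{1k}(m)\mathcal{F}_1(m)P_{1k}(m) + P_{1k}(m)\mathcal{F}_1(m)Q_{1k}(m)]\mathcal{S}_1(m)P_{1k}(m)\|^2_{HS}$$

$$\le \|Q_{1k}(m)\mathcal{F}_1(m)\mathcal{S}_1(m)P_{1k}(m)\|^2_{HS} + \|P_{1k}(m)\mathcal{F}_1(m)Q_{1k}(m)\mathcal{S}_1(m)P_{1k}(m)\|^2_{HS}$$

$$= \|Q_{1k}(m)\mathcal{F}_1(m)\mathcal{S}_1(m)P_{1k}(m)\|^2_{HS} + \|P_{1k}(m)\mathcal{F}_1(m)Q_{1k}(m)P_{1k}(m)\mathcal{S}_1(m)\|^2_{HS}$$

$$= \mathrm{tr}\big(P_{1k}(m)\mathcal{S}_1(m)\mathcal{F}_1(m)Q_{1k}(m)\mathcal{F}_1(m)\mathcal{S}_1(m)P_{1k}(m)\big) + 0$$

$$= \mathrm{tr}\big(\mathcal{S}_1(m)P_{1k}(m)Q_{1k}(m)\mathcal{F}_1(m)\mathcal{F}_1(m)Q_{1k}(m)P_{1k}(m)\mathcal{S}_1(m)\big) = 0.$$

Similarly,

$$\|\sqrt{n}v_{k3}(m)\|^2_{HS} \xrightarrow{d} \|P_{1k}(m)\mathcal{S}_1(m)[Q_{1k}(m)\mathcal{F}_1(m)P_{1k}(m) + P_{1k}(m)\mathcal{F}_1(m)Q_{1k}(m)]\|^2_{HS}$$

$$\le \|\mathcal{S}_1(m)P_{1k}(m)Q_{1k}(m)\mathcal{F}_1(m)P_{1k}(m)\|^2_{HS} + \|P_{1k}(m)\mathcal{S}_1(m)\mathcal{F}_1(m)Q_{1k}(m)\|^2_{HS}$$

$$= 0 + \mathrm{tr}\big(Q_{1k}(m)\mathcal{F}_1(m)\mathcal{S}_1(m)P_{1k}(m)\mathcal{S}_1(m)\mathcal{F}_1(m)Q_{1k}(m)\big)$$

$$= \mathrm{tr}\big(\mathcal{F}_1(m)Q_{1k}(m)P_{1k}(m)\mathcal{S}_1(m)P_{1k}(m)\mathcal{S}_1(m)P_{1k}(m)Q_{1k}(m)\mathcal{F}_1(m)\big) = 0.$$

Hence, Corollary 8.2 and Slutsky's Theorem ensure that

$$\sqrt{n}v_{k2}(m) \xrightarrow{d} P_{1k}(m)\mathcal{F}_1(m)P_{1k}(m)$$

which proves the first part of the theorem.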

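To see the validity of the second part of the theorem, assume that $d_k = 1$
and observe that

$$\sqrt{n}(\rho_{1kn}(m) - \rho_{1k}(m)) = \sqrt{n}\sum_{j=1}^{3}c_{kj}(m)$$

where

$$c_{k1}(m) = \langle [f_{1kn}(m) - f_{1k}(m)], \mathcal{S}_{1n}(m)f_{1kn}(m)\rangle_{\mathcal{H}(K_1)},$$

$$c_{k2}(m) = \langle f_{1k}(m), [\mathcal{S}_{1n}(m) - \mathcal{S}_1(m)]f_{1kn}(m)\rangle_{\mathcal{H}(K_1)},$$

$$c_{k3}(m) = \langle f_{1k}(m), \mathcal{S}_1(m)[f_{1kn}(m) - f_{1k}(m)]\rangle_{\mathcal{H}(K_1)}.$$

The terms $\sqrt{n}c_{k1}(m) \xrightarrow{p} 0$ and $\sqrt{n}c_{k3}(m) \xrightarrow{p} 0$ as a consequence of equation (79)
and Slutsky's theorem, since

$$\sqrt{n}c_{k1}(m) \xrightarrow{d} \langle Q_{1k}(m)\mathcal{F}_1(m)f_{1k}(m), \mathcal{S}_1(m)f_{1k}(m)\rangle_{\mathcal{H}(K_1)}$$

$$= \sum_{j \ne k}\frac{\rho_{1k}(m)}{\rho_{1j}(m) - \rho_{1k}(m)}\langle \mathcal{F}_1(m)f_{1k}(m), P_{1j}(m)f_{1k}(m)\rangle_{\mathcal{H}(K_1)} = 0$$

and

$$\sqrt{n}c_{k3}(m) \xrightarrow{d} \langle f_{1k}(m), \mathcal{S}_1(m)Q_{1k}(m)\mathcal{F}_1(m)f_{1k}(m)\rangle_{\mathcal{H}(K_1)}$$

$$= \sum_{j \ne k}\frac{\rho_{1k}(m)}{\rho_{1j}(m) - \rho_{1k}(m)}\langle P_{1j}(m)f_{1k}(m), \mathcal{F}_1(m)f_{1k}(m)\rangle_{\mathcal{H}(K_1)} = 0.$$

Application of Theorem 8.2 implies that

$$\sqrt{n}c_{k2}(m) \xrightarrow{d} \langle f_{1k}(m), \mathcal{F}_1(m)f_{1k}(m)\rangle_{\mathcal{H}(K_1)}.$$

Since

$$E\langle f_{1k}(m), \mathcal{F}_1(m)f_{1k}(m)\rangle_{\mathcal{H}(K_1)} = 0$$

and

$$\mathrm{Var}\langle f_{1k}(m), \mathcal{F}_1(m)f_{1k}(m)\rangle_{\mathcal{H}(K_1)} = E\langle f_{1k}(m), \mathcal{F}_1(m)f_{1k}(m)\rangle^2_{\mathcal{H}(K_1)} = \sigma_{kk}(m),$$

the proof is then complete. ◊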

The TSVD versions of the correlation estimators $\{\rho_{in}(m), \rho_{jn}(m)\}$ with $i \ne j$
are correlated, much like the Tikhonov case. In fact, when the operator $\mathcal{S}_1(m)$
is simple, we have for $i \ne j$


aijim) = Cov[pinim), pjnim)] 


= E 
= E 


ifinim),Sinim)finim))-H(Ki), ifjnim),Sinim)fjnim))L^Ei) 

((/m(w) /m(TO)) , Sinim)'SlHSiSlnim) (/jn (w) /i« (w)) ) 

,E Sinim)'SiHSiSinim) {jjim) him) 

((/i(w) fiim)'^ ,Si(to) (^fjim) iSi-HiKi) fjim)^'^ 


HS, 


' HSi
with


Si (m) = E 


f m ^ TT ,-1 f m i 


and $[\Sigma_1(m)]_{ij}$, the $(i, j)$th element of $\Sigma_1(m)$.


9. Conclusion 

In Sections 5-9 we discussed how both Tikhonov and TSVD regularized
estimators approach their intended target of approximation, the RKHS based
operator $TT^*$, in the limits of their respective regularization parameters. We
also showed that the asymptotics associated with Tikhonov and TSVD sample
estimators $\{\mathcal{S}_{1n}(\alpha), \mathcal{S}_{1n}(m)\}$ are similar in the sense that for every
distributional result for quantities relative to the Tikhonov estimator $\mathcal{S}_{1n}(\alpha)$, there is
an analogous distributional result for its TSVD cousin $\mathcal{S}_{1n}(m)$. The question to
ask here is whether or not one form of regularization should be preferred over
the other.

The answer to this question lies in one critical flaw in the Tikhonov approach
to FCCA, which up to this point has not yet been discussed. Although replacing
the operators $\{S_1, S_2\}$ with $\{(S_1 + \alpha I), (S_2 + \alpha I)\}$ fixes the operators'
invertibility issues, the operators still theoretically have infinite dimensionality. Infinite
dimensional operators are problematic because no computer will ever be able to
estimate all the eigenvalues and eigenvectors. On the sample side, since the
operators $\{S_{1n}, S_{2n}\}$ have rank at most $n$, they are rank deficient. Meanwhile the
operators $\{(S_{1n} + \alpha I), (S_{2n} + \alpha I)\}$ will have infinitely many eigenvalues equal to
$\alpha$. Any pragmatic computational scheme where Tikhonov regularization is
implemented would therefore involve some limit on the number of eigenvalue and
eigenvector pairs to be used and estimated. As a consequence, FCCA methods
will surely involve truncation. If we choose to implement Tikhonov regularization
with truncation this will involve the operator

$$S_{1n}(\alpha, m) = \sum_{j=1}^{m}(\lambda_{1jn} + \alpha)P_{1jn} = (S_{1n} + \alpha I)\Pi_{1n}(m) \qquad (87)$$

for some integer $1 \le m \le n$. The estimator in (87) has some characteristics
that are akin to both those of Tikhonov and TSVD regularization. Utilizing this
“truncated Tikhonov” estimator it follows that the corresponding regularized
estimator for $TT^*$ would be


Sin{a, m) = nin{m)Si 2 nSln{a, m)S 2 inSln{a, m). (88) 

Equation (88) illustrates that pragmatic implementation of Tikhonov regularization
in the FDA setting will in reality entail the use of both Tikhonov and TSVD
forms of regularization. By contrast, TSVD regularization entails replacing the
operators $\{S_1, S_2\}$ with $\{S_1\Pi_1(m), S_2\Pi_2(m)\}$ which have finite rank.
Consequently, TSVD regularization provides a remedy for both infinite dimensionality
and invertibility issues simultaneously.

Since there are errors which originate from regularization methods in general,
it is always better to use as few methods as possible. The triangle inequality can
now be utilized to establish a bound on the error associated with the “truncated
Tikhonov” estimator (87). In this regard, notice that

$$\|\mathcal{S}_{1n}(\alpha, m) - TT^*\| \le \|\mathcal{S}_{1n}(\alpha, m) - \mathcal{S}_{1n}(m)\| + \|\mathcal{S}_{1n}(m) - TT^*\|.$$

Hence the bound on the error associated with utilizing $\mathcal{S}_{1n}(\alpha, m)$ is never
smaller than the error incurred by simply using $\mathcal{S}_{1n}(m)$.
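
The rank statements behind this argument can be checked on discretized data. The following sketch is a finite-dimensional illustration only: the simulated random-walk curves, the grid size `p`, and the values of `alpha` and `m` are hypothetical choices, and the matrix `truncated_tikhonov` merely mimics the structure of (87) on a grid.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, alpha, m = 8, 40, 0.1, 3          # hypothetical sample size, grid size, parameters

curves = rng.standard_normal((n, p)).cumsum(axis=1)   # n discretized sample paths
centered = curves - curves.mean(axis=0)
s_n = centered.T @ centered / n                        # sample covariance, rank <= n - 1

vals, vecs = np.linalg.eigh(s_n)
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

tikhonov = s_n + alpha * np.eye(p)                     # invertible, but still full rank p
proj_m = vecs[:, :m] @ vecs[:, :m].T                   # projection onto leading m eigenvectors
truncated_tikhonov = tikhonov @ proj_m                 # grid analogue of (87)

rank_s_n = np.linalg.matrix_rank(s_n)
# Number of eigenvalues of the Tikhonov-shifted matrix that equal alpha.
n_alpha = int(np.sum(np.isclose(np.linalg.eigvalsh(tikhonov), alpha)))
rank_trunc = np.linalg.matrix_rank(truncated_tikhonov)
print(rank_s_n, n_alpha, rank_trunc)
```

On a grid the shifted matrix has only finitely many eigenvalues equal to `alpha`; in the infinite-dimensional setting discussed above that count is infinite, which is why truncation cannot be avoided in practice.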


10. Appendix: Some Perturbation Theory 


In this appendix we briefly summarize some results from perturbation theory.
The primary references for this section are Kato [22] and Dauxois et al. [9]. A
typical problem in perturbation theory is to determine how the eigenvalues
and eigenspaces of a linear operator $B$ change when $B$ is subjected to a small
perturbation. Let $A : \mathcal{H} \to \mathcal{H}$ be an arbitrary perturbation operator and let
$\tilde{B} = B + A$ represent the perturbed operator. In this regard, we might think
of $A$ as being small in terms of its uniform operator norm $\|A\|$. However, a
measure of “closeness” between $B$ and $\tilde{B}$ which is often of greater importance
is the aperture or gap between the graphs of the two operators.

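Let $\mathcal{M}$ and $\mathcal{N}$ be two closed linear manifolds on $\mathcal{H}$ with $S_{\mathcal{M}} = \{u \in
\mathcal{M} \mid \|u\|_{\mathcal{H}} = 1\}$, the unit sphere on $\mathcal{M}$. For any two closed linear manifolds
$\mathcal{M}, \mathcal{N} \subset \mathcal{H}$ let

$$\delta(\mathcal{M}, \mathcal{N}) = \begin{cases} \sup_{u \in S_{\mathcal{M}}}\{\mathrm{dist}(u, \mathcal{N})\} & \text{for } \mathcal{M} \ne \{0\}, \\ 0 & \text{if } \mathcal{M} = \{0\} \end{cases}$$

with

$$\mathrm{dist}(u, \mathcal{N}) = \inf_{v \in \mathcal{N}}\{\|u - v\|_{\mathcal{H}}\}.$$

The gap between $\mathcal{M}$ and $\mathcal{N}$ is then defined by

$$\hat{\delta}(\mathcal{M}, \mathcal{N}) = \max[\delta(\mathcal{M}, \mathcal{N}), \delta(\mathcal{N}, \mathcal{M})].$$

More details concerning $\delta(\mathcal{M}, \mathcal{N})$ and $\hat{\delta}(\mathcal{M}, \mathcal{N})$ can be found in Kato [22].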

If the graphs $\{G(B), G(\tilde{B})\}$ of two operators $\{B, \tilde{B}\}$ are closed, the closed
graph theorem entails that both $B$ and $\tilde{B}$ are bounded. Consequently it is
possible to define the gap between operators $B$ and $\tilde{B}$ by measuring the gap
between their associated graphs. In this regard we define

$$\delta(B, \tilde{B}) = \delta(G(B), G(\tilde{B})),$$

$$\hat{\delta}(B, \tilde{B}) = \hat{\delta}(G(B), G(\tilde{B})) = \max[\delta(B, \tilde{B}), \delta(\tilde{B}, B)],$$

and $\hat{\delta}(B, \tilde{B}) = \hat{\delta}(\tilde{B}, B)$ is called the gap between $B$ and $\tilde{B}$.

The notion of the gap between operators plays a large role in perturbation
theory. Suppose $B$ and $\tilde{B}$ are the original and perturbed operator respectively.
The smaller the gap $\hat{\delta}(B, \tilde{B})$ becomes, the more properties $\tilde{B}$ inherits from
$B$. Of particular importance is the following theorem from Kato [22] which
permits the construction of a closed curve $\Gamma$ around a part of the spectrum of $B$,
denoted $\Sigma(B)$, that also encloses a similar collection of spectral points of the
perturbed operator, $\Sigma(\tilde{B})$.

Theorem 10.1. (Semi-continuity of the spectrum) Let $B, \tilde{B} \in \mathcal{B}(\mathcal{H})$ and let
the spectrum of $B$, $\Sigma(B)$, be separated into two parts $\Sigma'(B)$, $\Sigma''(B)$ by a closed
curve $\Gamma$, with $\mathcal{H} = \mathcal{M}'(B) \oplus \mathcal{M}''(B)$. Then, there exists a $\delta > 0$, depending on
$\Gamma$ and $B$, such that if $\tilde{B}$ is any operator with $\hat{\delta}(B, \tilde{B}) < \delta$

(i) the spectrum $\Sigma(\tilde{B})$ is likewise separated by $\Gamma$ into two parts $\Sigma'(\tilde{B})$, $\Sigma''(\tilde{B})$
and both $\{\Sigma'(\tilde{B}), \Sigma''(\tilde{B})\}$ are non-empty if this is true for $\{\Sigma'(B), \Sigma''(B)\}$,

(ii) in the associated decomposition $\mathcal{H} = \mathcal{M}'(\tilde{B}) \oplus \mathcal{M}''(\tilde{B})$, $\{\mathcal{M}'(\tilde{B}), \mathcal{M}''(\tilde{B})\}$
are isomorphic with $\{\mathcal{M}'(B), \mathcal{M}''(B)\}$, respectively,

(iii) $\dim(\mathcal{M}'(\tilde{B})) = \dim(\mathcal{M}'(B))$ and $\dim(\mathcal{M}''(\tilde{B})) = \dim(\mathcal{M}''(B))$ and

(iv) the projection operator $P_{\tilde{B}}$ of $\mathcal{H}$ onto $\mathcal{M}'(\tilde{B})$ tends to the similarly defined
projection operator $P_B$ in operator norm as $\hat{\delta}(B, \tilde{B}) \to 0$.

We will now develop formulae for the differences in the resolvents and
projection operators between the perturbed and unperturbed operator. In this
regard, let $R(z) = (B - zI)^{-1}$ and $\tilde{R}(z') = (\tilde{B} - z'I)^{-1}$ denote the resolvents
of $B$ and $\tilde{B}$ for some $z \in \mathbb{C} \setminus \Sigma(B)$ and $z' \in \mathbb{C} \setminus \Sigma(\tilde{B})$, respectively. From Kato
[22], if $\lambda_k \in \Sigma(B)$ is some isolated point of the spectrum and $P_k$ is the associated
projection operator then

$$P_k = -\frac{1}{2\pi i}\oint_{\Gamma_k} R(z)\,dz \qquad (89)$$

where $\Gamma_k$ is a positively oriented curve that encloses $\lambda_k$ but no other spectral
values of $\Sigma(B)$.

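Now, whenever $\|(\tilde{B} - B)R(z)\| < 1$ and $z \in \mathbb{C} \setminus \Sigma(B)$ we may utilize the
Neumann series Theorem (Rynne and Youngson [33]) which ensures that

$$\tilde{R}(z) = \big((\tilde{B} - B) + (B - zI)\big)^{-1}$$

$$= \big((\tilde{B} - B) + (R(z))^{-1}\big)^{-1}$$

$$= R(z)\big((\tilde{B} - B)R(z) + I\big)^{-1}$$

$$= R(z)\sum_{k=0}^{\infty}\big[(B - \tilde{B})R(z)\big]^k$$

$$= R(z) + R(z)(B - \tilde{B})R(z)H(z) \qquad (90)$$

where $H(z) = \sum_{k=0}^{\infty}[(B - \tilde{B})R(z)]^k$. Another application of the Neumann series
theorem reveals that

$$H(z) = \sum_{k=0}^{\infty}\big[(B - \tilde{B})R(z)\big]^k = \big[I - (B - \tilde{B})R(z)\big]^{-1}$$

and hence

$$\tilde{R}(z) - R(z) = R(z)(B - \tilde{B})R(z)\big[I - (B - \tilde{B})R(z)\big]^{-1}.$$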
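
The resolvent identity above is easy to sanity-check with matrices. The sketch below is a small numerical check under hypothetical choices: a diagonal matrix `b` stands in for $B$, `a` is a small symmetric perturbation, and `z` is an arbitrary point off both spectra.

```python
import numpy as np

rng = np.random.default_rng(2)
p = 5
b = np.diag(np.arange(1.0, p + 1))            # hypothetical unperturbed operator B
a = rng.standard_normal((p, p))
a = 1e-2 * (a + a.T)                          # small symmetric perturbation A
b_tilde = b + a                               # perturbed operator
z = 1.5 + 0.5j                                # a point off both spectra
eye = np.eye(p)

r = np.linalg.inv(b - z * eye)                # R(z)
r_tilde = np.linalg.inv(b_tilde - z * eye)    # resolvent of the perturbed operator
x = (b - b_tilde) @ r                         # (B - B~) R(z)
assert np.linalg.norm(x, 2) < 1               # condition for the Neumann series

h_series = sum(np.linalg.matrix_power(x, k) for k in range(60))   # partial sum of H(z)
h_closed = np.linalg.inv(eye - x)             # closed form of H(z)
lhs = r_tilde - r
rhs = r @ (b - b_tilde) @ r @ h_closed
print(np.allclose(h_series, h_closed), np.allclose(lhs, rhs))
```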


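Now let $\{\lambda_j, \tilde{\lambda}_j\}$ be particular spectral values for $\{B, \tilde{B}\}$, and let $\{P_j, \tilde{P}_j\}$
denote the corresponding eigenprojection operators. Provided $\hat{\delta}(B, \tilde{B})$ is
small enough, Theorem 10.1 ensures that a positively oriented circle $\Gamma_j$, with
radius $r$, can be drawn to enclose both $\lambda_j$ and $\tilde{\lambda}_j$ but no other spectral values
of either $B$ or $\tilde{B}$. As a consequence of (89) and (90) we then obtain

$$\tilde{P}_j - P_j = \frac{1}{2\pi i}\oint_{\Gamma_j}\big[R(z) - \tilde{R}(z)\big]\,dz = \frac{1}{2\pi i}\oint_{\Gamma_j} R(z)(\tilde{B} - B)R(z)H(z)\,dz. \qquad (91)$$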


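Equation (91) allows us to formulate a crude bound on the uniform operator
norm of $\tilde{P}_j - P_j$, specifically

$$\|\tilde{P}_j - P_j\| \le \frac{1}{2\pi}\oint_{\Gamma_j}\big\|R(z)(\tilde{B} - B)R(z)\big[I - (B - \tilde{B})R(z)\big]^{-1}\big\|\,|dz|$$

$$\le \frac{2\pi r}{2\pi}\sup_{z \in \Gamma_j}\frac{\|R(z)\|^2\|\tilde{B} - B\|}{1 - \|\tilde{B} - B\|\|R(z)\|}$$

$$= r\sup_{z \in \Gamma_j}\frac{\|R(z)\|^2\|\tilde{B} - B\|}{1 - \|\tilde{B} - B\|\|R(z)\|}. \qquad (92)$$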


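Another formula for $\tilde{P}_j - P_j$ can be derived by expanding the first term
of $H(z)$ so that

$$H(z) = I + \sum_{l=1}^{\infty}\big((B - \tilde{B})R(z)\big)^l. \qquad (93)$$

Plugging (93) into (91) gives

$$\tilde{P}_j - P_j = \frac{1}{2\pi i}\oint_{\Gamma_j} R(z)(\tilde{B} - B)R(z)\,dz - \frac{1}{2\pi i}\oint_{\Gamma_j} M(z)\,dz \qquad (94)$$

where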


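$$M(z) = R(z)\sum_{l=2}^{\infty}(-AR(z))^l = O(A^2).$$

Using the partial fraction expansion of the resolvent from Kato [22] it follows
that

$$R(z) = \sum_{j=1}^{\infty}\left[\frac{P_j}{\lambda_j - z} + O\big((\lambda_j - z)^{-2}\big)\right]. \qquad (95)$$

Now in (95) the higher-order terms involving $O((\lambda_j - z)^{-n})$ can be ignored due
to Cauchy's theorem since, for $n \ge 2$,

$$\oint_{\Gamma_j}(\lambda_j - z)^{-n}\,dz = \oint w^{n-2}\,dw = 0$$

where the substitution $w = (\lambda_j - z)^{-1}$ has been used. Thus, since

$$\frac{1}{2\pi i}\oint_{\Gamma_j}\frac{1}{\lambda_j - z}\,\frac{1}{\lambda_k - z}\,dz = \begin{cases} \dfrac{1}{\lambda_j - \lambda_k} & \text{if } k \ne j, \\ 0 & \text{otherwise} \end{cases} \qquad (96)$$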
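
The residue computation in (96) can be confirmed by direct numerical integration around a circle. This is a minimal sketch: the values `lam_j` and `lam_k`, the radius, and the helper `contour_integral` are hypothetical choices made only for illustration.

```python
import cmath

def contour_integral(f, center, radius, steps=2000):
    """Approximate (1 / (2*pi*i)) times the integral of f over a positively oriented circle."""
    total = 0j
    for s in range(steps):
        w = radius * cmath.exp(2j * cmath.pi * s / steps)
        # With z = center + w and dz = i * w * dtheta, the factor 1/(2*pi*i)
        # reduces the trapezoidal sum to an average of f(z) * w.
        total += f(center + w) * w
    return total / steps

lam_j, lam_k = 2.0, 3.5   # hypothetical isolated eigenvalues; the circle encloses only lam_j

distinct = contour_integral(lambda z: 1.0 / ((lam_j - z) * (lam_k - z)), lam_j, 0.5)
repeated = contour_integral(lambda z: 1.0 / (lam_j - z) ** 2, lam_j, 0.5)
print(distinct, 1.0 / (lam_j - lam_k), repeated)
```

The first value agrees with $1/(\lambda_j - \lambda_k)$ and the second vanishes, matching the two cases of (96).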


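it follows that

$$\frac{1}{2\pi i}\oint_{\Gamma_j} R(z)AR(z)\,dz = \frac{1}{2\pi i}\sum_{k}\sum_{l}\oint_{\Gamma_j}\frac{1}{(\lambda_k - z)(\lambda_l - z)}\,dz\,P_kAP_l$$

$$= \sum_{k \ne j}\frac{1}{\lambda_j - \lambda_k}(P_kAP_j + P_jAP_k).$$

Therefore

$$\tilde{P}_j - P_j = \sum_{k \ne j}\frac{1}{\lambda_j - \lambda_k}(P_kAP_j + P_jAP_k) - \frac{1}{2\pi i}\oint_{\Gamma_j} M(z)\,dz. \qquad (97)$$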
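
A quick way to see (97) at work is to compare the exact change in an eigenprojection of a symmetric matrix with the first-order term and watch the remainder shrink quadratically. The sketch below assumes a hypothetical diagonal `b` with simple eigenvalues and a symmetric perturbation direction `sym`; none of these objects come from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
p, j = 5, 1                                   # j is the 0-based index of the target eigenvalue
lam = np.array([1.0, 2.0, 3.5, 5.0, 7.0])     # hypothetical simple spectrum
b = np.diag(lam)
projs = [np.outer(e, e) for e in np.eye(p)]   # eigenprojections P_k of B

def eigenprojection(mat, index):
    """Projection onto the eigenvector for the index-th smallest eigenvalue of a symmetric matrix."""
    _, vecs = np.linalg.eigh(mat)
    v = vecs[:, index]
    return np.outer(v, v)

def first_order_term(a):
    """Lead term of (97): sum over k != j of (P_k A P_j + P_j A P_k) / (lam_j - lam_k)."""
    return sum((projs[k] @ a @ projs[j] + projs[j] @ a @ projs[k]) / (lam[j] - lam[k])
               for k in range(p) if k != j)

sym = rng.standard_normal((p, p))
sym = (sym + sym.T) / 2                       # symmetric perturbation direction

def remainder(eps):
    """Spectral norm of what is left after subtracting the first-order term."""
    a = eps * sym
    return np.linalg.norm(eigenprojection(b + a, j) - projs[j] - first_order_term(a), 2)

print(remainder(1e-3), remainder(1e-4))
```

Shrinking the perturbation by a factor of ten should shrink the remainder by roughly a factor of one hundred, consistent with the $O(A^2)$ term in (97).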


Equation (97) has several important implications as it allows us to formulate
the notion of the Fréchet derivative of an analytic function of an operator. Now
suppose that a function $\phi(z)$ is analytic in a domain $\Delta$ of the complex plane
containing all the spectral values $\{\lambda_k, \tilde{\lambda}_k\}$ of $\{B, \tilde{B}\}$, with $\Gamma \subset \Delta$ a positively
oriented closed curve that encloses all spectral values in its interior. Utilizing
the Dunford-Taylor integral for $\phi(B)$ and $\phi(\tilde{B})$ (see Kato [22]) we obtain

$$\phi(\tilde{B}) - \phi(B) = \frac{1}{2\pi i}\oint_{\Gamma}\phi(z)\big[R(z) - \tilde{R}(z)\big]\,dz$$

$$= \frac{1}{2\pi i}\oint_{\Gamma}\phi(z)R(z)AR(z)\,dz - \frac{1}{2\pi i}\oint_{\Gamma}\phi(z)M(z)\,dz$$

$$= \frac{1}{2\pi i}\sum_{k}\sum_{j}\oint_{\Gamma}\frac{\phi(z)}{(\lambda_k - z)(\lambda_j - z)}\,dz\,P_kAP_j + O(A^2). \qquad (98)$$


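Focussing on the integral in the first term on the right-hand side we see that

$$\frac{1}{2\pi i}\oint_{\Gamma}\frac{\phi(z)}{(\lambda_k - z)(\lambda_j - z)}\,dz = \begin{cases} \phi'(\lambda_j) & \text{if } k = j, \\ \dfrac{\phi(\lambda_k) - \phi(\lambda_j)}{\lambda_k - \lambda_j} & \text{if } k \ne j. \end{cases}$$

Equation (98) can then be written as

$$\phi(\tilde{B}) - \phi(B) = \sum_{j \ge 1}\phi'(\lambda_j)P_jAP_j + \sum_{k \ne j}\frac{\phi(\lambda_k) - \phi(\lambda_j)}{\lambda_k - \lambda_j}P_kAP_j + O(A^2).$$

Now, since $\phi(\tilde{B}) = \phi(B) + \phi'_B A + O(A^2)$, the Fréchet derivative at $B$ is

$$\phi'_B A = \sum_{j \ge 1}\phi'(\lambda_j)P_jAP_j + \sum_{k \ne j}\frac{\phi(\lambda_k) - \phi(\lambda_j)}{\lambda_k - \lambda_j}P_kAP_j. \qquad (99)$$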


Equation (99) will be used extensively when we consider the delta method for 
functions of random operators. 
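
For symmetric matrices with simple eigenvalues, the divided-difference formula (99) can be evaluated directly and compared with cases where the derivative is known in closed form. The sketch below is an illustration under assumptions: `frechet_derivative` is a hypothetical helper, `b` is a random symmetric positive definite matrix, and the two test functions $\phi(z) = z^2$ and $\phi(z) = 1/z$ are chosen because their derivatives are $BA + AB$ and $-B^{-1}AB^{-1}$.

```python
import numpy as np

def frechet_derivative(lam, vecs, a, phi, dphi):
    """Evaluate (99) for a symmetric matrix with simple eigenvalues lam and eigenvectors vecs."""
    a_eig = vecs.T @ a @ vecs                       # entries <v_k, A v_j>
    diff = lam[:, None] - lam[None, :]
    np.fill_diagonal(diff, 1.0)                     # placeholder to avoid 0/0 on the diagonal
    weights = (phi(lam)[:, None] - phi(lam)[None, :]) / diff
    np.fill_diagonal(weights, dphi(lam))            # phi'(lam_j) when k == j
    return vecs @ (weights * a_eig) @ vecs.T

rng = np.random.default_rng(4)
p = 4
g = rng.standard_normal((p, p))
b = g @ g.T + np.eye(p)                             # hypothetical symmetric positive definite B
a = rng.standard_normal((p, p))
a = (a + a.T) / 2                                   # symmetric perturbation A
lam, vecs = np.linalg.eigh(b)

# phi(z) = z^2 has derivative B A + A B; phi(z) = 1/z has derivative -B^{-1} A B^{-1}.
square = frechet_derivative(lam, vecs, a, lambda x: x ** 2, lambda x: 2 * x)
inverse = frechet_derivative(lam, vecs, a, lambda x: 1.0 / x, lambda x: -1.0 / x ** 2)
b_inv = np.linalg.inv(b)
print(np.allclose(square, b @ a + a @ b), np.allclose(inverse, -b_inv @ a @ b_inv))
```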